LAMP:
System Description 


By H. Y. CHANG, G. W. SMITH, Jr., and R. B. WALFORD 
(Manuscript received February 28, 1974) 


A general description of the Logic Analyzer for Maintenance Planning 
(LAMP) system is presented. LAMP is a digital-logic simulation and 
analysis system used for logic-design verification, for generation and 
evaluation of fault-detection and diagnostic tests, and for generation of the 
trouble-location manual (or fault dictionary) data. It is implemented on the 
IBM 360/370 TSS and OS machines (for both interactive and batch 
operations), and has been in active use at Bell Laboratories in the develop- 
ment of electronic switching systems, data set facilities, transmission 
equipment, and advanced integrated-circuit technologies. 


I. INTRODUCTION 


The explosive evolution of digital devices, computers, and systems 
since the invention of the transistor has necessitated a parallel industry- 
wide development of tools for the design and test of logic circuits. 
Whereas the oscilloscope was the mainstay of the digital circuit de- 
signer in the early days of discrete-transistor logic circuits, it soon 
proved to be inadequate for design verification and fault-behavior 
testing of large systems employing integrated, digital logic. In response 
to this need for better logic-circuit-development tools, a host of digital- 
simulator algorithms and simulator systems has been produced.1-3

The need for reliable and dependable electronic switching systems 
(ESS) poses critical design problems. Computer-aided techniques can 
be used effectively for: 


(i) Analysis and enhancement of system diagnosability. 
(ii) Logic-design verification. 
(iii) Generation of fault-detection tests. 
(iv) Analysis of faulty-circuit behavior. 
(v) Verification and evaluation of diagnostic-test designs. 
(vi) Production of trouble-location manuals (TLMs). 


The LAMP system has been designed to attack these problems in a 
systematic manner. 

This paper provides a brief description of the LAMP system organi- 
zation and features, and is intended to serve as background for the four 
following papers. These provide details of the logic simulators, the 
automatic-test-generation system, and the techniques for organizing 
system design for diagnosability.4-6 They include a specific example 
of how LAMP was employed in the development of a large processor 
for an electronic switching system.7


II. EVOLUTION OF THE LAMP SYSTEM 


The decision to build a machine-aids system with digital-simulation 
capability was motivated by the successful use by Bell Laboratories 
designers of the sequential analyzer.® The use of this simulator showed 
the great advantages of using simulation for logic testing and fault 
diagnosis. By 1966, Bell Laboratories was incorporating simulation 
techniques into the design cycle of electronic-switching-system equip- 
ment. However, there were several difficulties in the day-to-day use of 
this simulator. It had a restrictive logic model, long turnaround time 
due to remote computer location, and no capability for handling large 
circuits (for example, circuits having as many as 10,000 gates). Because 
no simulator then available could meet the growing demand for logic- 
simulation service, a decision was made to develop an advanced logic- 
simulator system that would grow and adapt to Bell Laboratories' 
current and future needs. 

It is instructive that the motivation to develop a design-aids system 
came from the potential users of that system. Likewise, the initial 
design objectives and the evolution of the system were influenced to a 
large extent by the intended users. This has resulted in a very sophisti- 
cated, user-oriented system which continues to grow and evolve to 
meet the changing requirements of the designer. 

The initial system was made available to users in late 1969 on IBM 
System/360 TSS at Bell Laboratories, Naperville, Illinois. It had only 
a modest set of features. However, the user reactions were generally 
favorable. Since then, substantial improvements in system performance 
and capabilities have been incorporated. The TSS version of LAMP 
was converted to run on IBM System/360 OS in mid-1970 and was 
made available to Bell Laboratories users at Holmdel, New Jersey, 
and Columbus, Ohio. Automatic-test-generation capability was in- 
corporated in early 1972; and the facilities to analyze system structural 
diagnosability were implemented in late 1972. The LAMP system is 
in active use in the development of many ESS projects as well as other 
non-ESS work such as the development of data-set facilities, trans- 
mission equipment, and advanced integrated-circuit technologies. The 
current user group includes twenty organizations from nine Bell 
Laboratories locations (Murray Hill, Whippany, Holmdel, Allentown, 
Columbus, Merrimack Valley, Indianapolis, Denver, and Naperville). 


III. SYSTEM ORGANIZATION 


LAMP is a system of programs designed to be used for logic-design 
verification, evaluation of fault-detection tests and diagnostic pro- 
grams, and generation of the trouble-location manual (or fault dic- 
tionary) data. It is implemented on the IBM 360/370 TSS and OS 
machines (for both interactive and batch operations). The current 
version can handle circuits containing up to 65,000 gates. The system 
is composed of four basic parts: 


(i) A circuit-description-language compiler. 
(ii) A command-language interpreter. 
(iii) A collection of design tools composed of an automatic-test- 
generation (ATG) system;‘ a controllability, observability, and 
maintenance engineering techniques (COMET"*) system; and 
a family of simulators.*® 
(iv) An output system. 


A block diagram showing the functional relationship of the various 
parts of the LAMP system is presented in Fig. 1. A logic circuit can be 
described to the LAMP system through a user-oriented language 
called LSL-LOCAL. The circuit description is then translated by the 
language compiler into simulation tables. The command-language 
interpreter directs all the actions of simulation, test generation, and 
diagnosability analysis in accordance with user-specified commands 
and information stored in the simulation table. 

Fig. 1. Block diagram of LAMP system. 


For a given logic-circuit description, the ATG system can auto- 
matically produce the test-vector information. To verify logic design 
and to study faulty-circuit behavior, a family of simulators can be used. 
The inputs applied to the simulators can be manually generated and/or 
generated by the ATG programs. The simulators are capable of simulat- 
ing circuit behavior in either fault-free or faulty mode, with facilities 
to handle race and oscillation conditions and to perform detailed timing 
analysis. 

If the purpose at hand is to determine the diagnosability of the 
design, the COMET system can be used to assist the users to organize 
systems design for diagnosability by systematically determining the 
optimum placement of control-access and monitor points. Simulation 
and analysis results are then collected under the control of an output 
system. Numerous output options can be specified that allow users 
to obtain information concerning logic verification, timing analysis, 
and other data-processing information at the time of simulation or 
afterwards. 

In the following sections, the salient features of the various major 
functional blocks in the LAMP system will be described. 


3.1 Circuit description input language 


A logic circuit is described to the LAMP system through a user- 
oriented language called LSL-LOCAL. This language permits the 
entry of all information concerning the particular circuit either at the 
gate level or at the functional level. At the gate level, circuits are de- 
scribed in terms of logic elements such as NANDs, NORs, ANDs, ORs, 
and NOTs, whereas at the functional level the circuits are expressed as 
memories, registers, clocks, etc. LSL-LOCAL was designed as an easily 
extendible language, to be used by circuit designers and diagnosticians 
who may not be trained as programmers. 

Once the circuit description is entered, the LSL-LOCAL language 
processor compiles the description into data tables to be used by the 
simulator(s), the ATG system, and the COMET analysis programs. 
The language processor has a substantial number of checks built into 
it to detect and intercept most errors before they can get into the sys- 
tem. These checks include syntax checks (for missing parameters, 
illegal characters, etc.) and circuit connectivity and consistency checks 
such as fan-in/fan-out limits. These features enable the users to check 
the coding of a circuit efficiently in terms of cost and time. 

The original version of the language processor was developed in late 
1969. Since then, three major revisions have been implemented to 
enhance its capability and performance. Many of the improvements 
were incorporated to support a wider range of applications, and the 
language has become a standard logic design input language in Bell 
Laboratories. 

As an example of the LSL-LOCAL circuit description, an exclusive- 
OR circuit as shown in Fig. 2a can be encoded as: 


CKTNAME: XOR; 
INPUTS:  A, B; 
OUTPUTS: X; 
NOT:     A', A; 
         B', B; 
NAND:    AB', (A, B'); 
         BA', (B, A'); 
         AXB, (AB', BA'), (X); 
         (gate name) (input list) (output) 

The description generally consists of three parts: (i) the CKTNAME 
statement, which introduces the circuit description and declares the 
name of the circuit; (ii) connection declarations, which specify the 
names and the types of all the input/output signals of the circuit; and 
(iii) interconnection blocks, which specify elements and networks used 
in the circuit and how these are interconnected. The hierarchical struc- 
ture of the language allows the specification of circuits in a modular 
fashion. Thus, the exclusive-OR circuit can be used as an element in 
describing a single-bit adder [see Fig. 2(b)]: 


CKTNAME: ADDERI; 
INPUTS:  A, B, K; 
OUTPUTS: C, K_; 
XOR:     AXB, (A, B), (X); 
         D, (X, K), (C); 
NOT:     A', A; 
         B', B; 
NAND:    ANB, (A, B); 
         AORB, (A', B'); 
         AORBNK, (AORB, K); 
         K_, (ANB, AORBNK); 


These single-bit circuits can then be used to describe an n-bit adder or 
other more complex logic element(s). There is no explicit limit on the 
number of levels of nesting in describing circuits using LSL-LOCAL. 
A user can very conveniently construct the data base of a large circuit 
or system by combining the various data bases from its component 
circuit modules. 
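The modular description above can be checked by hand with a few lines of code. The following is a minimal sketch in Python, not the LAMP simulation-table format: the dictionary layout, the key name "ADDER", and the function `evaluate` are assumptions made for illustration. It transcribes the two circuit descriptions and evaluates the adder by expanding the nested exclusive-OR element.

```python
from itertools import product

# Hypothetical in-memory form of the two circuit descriptions:
# each element is (type, gate name, input signals, output signal).
CIRCUITS = {
    "XOR": {
        "inputs": ["A", "B"],
        "outputs": ["X"],
        "elements": [
            ("NOT", "A'", ["A"], "A'"),
            ("NOT", "B'", ["B"], "B'"),
            ("NAND", "AB'", ["A", "B'"], "AB'"),
            ("NAND", "BA'", ["B", "A'"], "BA'"),
            ("NAND", "AXB", ["AB'", "BA'"], "X"),
        ],
    },
    "ADDER": {
        "inputs": ["A", "B", "K"],
        "outputs": ["C", "K_"],
        "elements": [
            ("XOR", "AXB", ["A", "B"], "X"),
            ("XOR", "D", ["X", "K"], "C"),
            ("NOT", "A'", ["A"], "A'"),
            ("NOT", "B'", ["B"], "B'"),
            ("NAND", "ANB", ["A", "B"], "ANB"),
            ("NAND", "AORB", ["A'", "B'"], "AORB"),
            ("NAND", "AORBNK", ["AORB", "K"], "AORBNK"),
            ("NAND", "K_", ["ANB", "AORBNK"], "K_"),
        ],
    },
}

PRIMITIVES = {
    "NOT": lambda v: 1 - v[0],
    "NAND": lambda v: 0 if all(v) else 1,
}


def evaluate(name, values):
    """Evaluate circuit `name`; `values` maps input names to 0 or 1."""
    ckt = CIRCUITS[name]
    sig = dict(values)
    # In this sketch the elements are already listed in evaluation order.
    for kind, _gate, ins, out in ckt["elements"]:
        args = [sig[i] for i in ins]
        if kind in PRIMITIVES:
            sig[out] = PRIMITIVES[kind](args)
        else:
            # Nested circuit: bind arguments to its inputs by position.
            sub = CIRCUITS[kind]
            result = evaluate(kind, dict(zip(sub["inputs"], args)))
            sig[out] = result[sub["outputs"][0]]
    return {o: sig[o] for o in ckt["outputs"]}


for a, b, k in product((0, 1), repeat=3):
    r = evaluate("ADDER", {"A": a, "B": b, "K": k})
    print(a, b, k, "->", r["C"], r["K_"])
```

Under these assumptions, C behaves as the sum bit and K_ as the carry out of A, B, and the carry in K, which is what a one-bit adder requires.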


3.2 Command system 


The control of LAMP system action for test generation, simulation, 
and COMET analysis is accomplished by means of a command- 
language structure. A large selection of interactive commands is avail- 
able which enables the users to compile and edit a circuit description, 
specify simulation-test vectors, make simulation runs, observe circuit 
behavior, gather circuit statistics, determine optimal placement of 
maintenance-access and observation points to enhance diagnosability, 
and specify types of simulation and analysis output. At present, there 
are approximately 80 commands in the system, many of which were 
implemented at the request of users. The commands are highly user- 
oriented so that one can easily learn the use of the system after a rela- 
tively minor amount of instruction. 

Fig. 2. (a) Exclusive-OR circuit. (b) One-bit adder circuit. 


The system structure is implemented with four levels of hierarchy. 
On the base level is the executive routine which reads commands en- 
tered by the system user and interprets them as to type. It then calls 
the appropriate routine to handle the command. On the next level are 
the command handlers whose functions are to process the command 
line and call the appropriate functional processors and service routines. 
On the third level are the functional processors; they are designed 
to perform specific functions such as simulation, circuit-description 
and test-vector compilation, circuit modification, processing control, 
and interactive control. On the fourth level are the various service 
routines that perform such tasks as gate-name retrieval, print control, 
vector translation, preliminary processing of data lines, file accessing, 
and printing. 
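The four-level structure can be pictured as a chain of calls from the executive down to the service routines. The sketch below is only an illustration of that layering in Python; every function name is hypothetical, the processors are stand-ins that merely report what they would do, and the operand syntax is an assumption modeled on the examples in Section V.

```python
# Level 4: service routines (shared helpers).
def parse_operands(text):
    """Split a comma-separated operand string into a list."""
    return [p.strip() for p in text.split(",") if p.strip()]


# Level 3: functional processors (stand-ins that only report their action).
def load_tables(data_set):
    return f"loaded {data_set}"


def simulate(simulator, vectors, results):
    return f"ran {simulator} on {vectors} with results spec {results}"


# Level 2: command handlers process the command line and call lower levels.
def handle_load(operands):
    (data_set,) = parse_operands(operands)
    return load_tables(data_set)


def handle_run(operands):
    simulator, _, rest = operands.partition(" ")
    vectors, results, *_options = parse_operands(rest)
    return simulate(simulator, vectors, results)


HANDLERS = {"load": handle_load, "run": handle_run}


# Level 1: the executive reads a command, classifies it, and dispatches.
def executive(line):
    name, _, operands = line.strip().partition(" ")
    handler = HANDLERS.get(name.lower())
    if handler is None:
        return f"unknown command: {name}"
    return handler(operands)


print(executive("load expl.tables"))
print(executive("run tval expl.test.vectors,expl.output.results,p"))
```

A table of handlers keyed by command name is one simple way to let new commands be added without touching the executive, which fits a system whose command set grew largely by user request.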

To illustrate the richness of the command language, a few of the 
most commonly used commands for logic simulation are described. 
Referring to Fig. 3, to enter a circuit description into LAMP, the LSL- 
LOCAL encoding of the circuit is first compiled (using SOURCE) 
and the resultant simulator tables are loaded (using LOAD). A circuit can 
also be formed by combining several circuits into one using LINK. 
Should it become necessary to modify the circuit logic without recom- 
piling the entire circuit, CKTCHANGE can be used to connect or 
disconnect gates, add gates, and rename, change, or remove gates. 

The input test vectors for simulation can be described in either tri- 
nary (0, 1, and "don't know"), octal, or hexadecimal form (using 
INVEC), or in terms of vector names defined by PATTERN. In cer- 
tain applications, the STATE command is used to set the circuit-state 
variables to initialize a circuit before a simulation run. Internal gates 
of the circuit can be treated as additional circuit outputs or test points 
by issuing the MONITOR command. Conversely, normal circuit- 
output leads can be MASKED out for a particular run. 
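The trinary values mentioned above follow the usual rules of three-valued logic: a controlling input decides the output even when other inputs are unknown. The sketch below shows those rules in Python under the assumption that the "don't know" value is represented by the string "X"; the function names are illustrative and are not taken from LAMP.

```python
X = "X"  # the "don't know" value (representation assumed for this sketch)


def t_not(a):
    """Three-valued NOT: an unknown input gives an unknown output."""
    return X if a == X else 1 - a


def t_and(*vals):
    """Three-valued AND."""
    if 0 in vals:
        return 0   # a controlling 0 decides the output
    if X in vals:
        return X   # otherwise any unknown input makes the output unknown
    return 1


def t_nand(*vals):
    """Three-valued NAND, built from the two functions above."""
    return t_not(t_and(*vals))


print(t_nand(0, X))  # known, because the 0 input is controlling
print(t_nand(1, X))  # unknown
```

Starting a simulation with every state variable at "don't know" and watching which gates resolve to 0 or 1 is one reason an initializing command such as STATE is useful.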

The what, when, and how much of the simulation statistics that are 
to be processed after a run are defined through RESULTS. A simula- 
tion is initiated by the RUN command and can be temporarily halted 
by a STOP command. At a STOP, the user may interrogate the state 
of the simulation and obtain simulation statistics accumulated up to 
that point (by using the DISPLAY command), or he may overwrite 
the next input vector in the INVEC data set by issuing an ALTER 
command. The simulation can be resumed by issuing a GO command. 
If the user wishes to change the course of simulation during a STOP, 
he can use the JUMP feature to skip unwanted test vectors. 

To facilitate the use of the LAMP system in the production mode, 
many commands have been developed for analyzing circuit topology, 
gathering circuit statistics, and performing audits. Some examples are 
the CKTCHECK command to check the consistency of simulation 
tables and to provide statistical information such as counts of gate and 
functional types, average fan-in and fan-out for each type, percentages 
of types to total, etc., and the CKTSTAT command which prints a 
brief summary of circuit statistics including number of gates, number 
of circuit inputs, number of circuit outputs, number of clocks, and 
number of nonfaulted gates. For topological analysis, the LOOPS 
command allows one to identify all loops within a circuit or contained 
by a specified gate, the FEEDBACK command identifies the minimum 
number of feedback loops within the circuit, the PATH command finds 
the shortest path between a specified gate and any input, and the MSC 
command identifies all maximally strongly connected sets of gates 
within the circuit. All these commands have proven to be ex- 
tremely useful, especially in the course of simulating large circuits 
(e.g., those containing 50,000 gates) under fault conditions.7

Fig. 3. Examples of commands used in simulation. 
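To make the MSC idea concrete, the following Python sketch finds the maximal strongly connected sets of a small gate graph. It uses Kosaraju's two-pass depth-first search, which is a standard textbook method and is not claimed to be the algorithm LAMP uses; the fan-out dictionary and the latch example are assumptions made for illustration.

```python
def strongly_connected_sets(fanout):
    """Return the maximal strongly connected sets of a directed gate graph.

    `fanout` maps each gate name to the list of gates it drives.
    """
    gates = set(fanout) | {g for outs in fanout.values() for g in outs}
    fanin = {g: [] for g in gates}
    for g, outs in fanout.items():
        for h in outs:
            fanin[h].append(g)

    # Pass 1: depth-first search along fan-out, recording finish order.
    order, seen = [], set()

    def visit(g):
        seen.add(g)
        for h in fanout.get(g, []):
            if h not in seen:
                visit(h)
        order.append(g)

    for g in sorted(gates):
        if g not in seen:
            visit(g)

    # Pass 2: search along fan-in, in reverse finish order.
    sets, assigned = [], set()

    def collect(g, members):
        assigned.add(g)
        members.add(g)
        for h in fanin[g]:
            if h not in assigned:
                collect(h, members)

    for g in reversed(order):
        if g not in assigned:
            members = set()
            collect(g, members)
            sets.append(members)
    return sets


# Cross-coupled latch: Q and QN drive each other; S and R are inputs.
latch = {"S": ["Q"], "R": ["QN"], "Q": ["QN", "OUT"], "QN": ["Q"], "OUT": []}
loops = [s for s in strongly_connected_sets(latch) if len(s) > 1]
print(loops)
```

Sets with more than one gate mark the feedback structure of the circuit. The recursive search is adequate for a sketch; a circuit with tens of thousands of gates would call for an iterative version.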

While the LAMP commands generally assume interactive use of the 
system (on 360/TSS), they also permit the use of the system in the 
noninteractive mode (such as 360/TSS batch or 360/370 OS). In these 
cases, some advance planning must be done to enable the runs to be 
completed successfully. 


3.3 Major tools 


There are three major tools in the LAMP system: an automatic-test- 
generation (ATG) system, a family of simulators, and a system for 
diagnosability analysis (COMET). Detailed descriptions of these tools 
are covered in the companion papers.4-6 The purpose of this section 
is to describe the salient features of these systems and to briefly de- 
scribe the interactions among them and the rest of the LAMP system. 


3.3.1 Automatic-test-generation (ATG) system 


ATG is a system of programs that can automatically produce the 
test-vector information for a given logic-circuit description. The faults 
considered are the classical input open, output stuck-at-one, and output 
stuck-at-zero for each gate in the circuit. There are two major differences 
between ATG and those test generators that have been discussed in 
the literature.’ First, ATG is capable of handling both combina- 
tional and sequential circuits without the need to identify feedback 
lines. Second, the system treats logic circuits as an interconnection 
of unit- and zero-time-delay gates, and thus improves the accuracy of 
the circuit modeling. 

ATG interacts with other parts of the LAMP system via the com- 
mand-language interpreter (see Fig. 1). A set of about 20 commands is 
available to the user to set the initial conditions (e.g., loading the cir- 
cuit description, specifying sequence length of the test), select test- 
generation strategies, specify output procedure, and direct the general 
course of action. The fault-detection level achieved by the tests gen- 
erated by ATG can be evaluated by using the fault simulators avail- 
able in the LAMP system. If the evaluation results indicate that the 
detection level is not adequate, ATG can be called again to generate 
more tests, by using different test strategies and/or changing the 
sequence length of the tests. This test-generation and evaluation loop 
can be repeated several times until a specified level of detectability is 
achieved. 
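The generate-and-evaluate loop can be sketched as follows. This Python example is a deliberately simplified stand-in: the three-gate combinational circuit is invented for illustration, test "generation" is plain enumeration of input vectors rather than an ATG strategy, and only output stuck-at faults are modeled. It is meant to show the shape of the loop, not the ATG system itself.

```python
from itertools import product

# Small example circuit: (output, type, inputs), in evaluation order.
GATES = [
    ("N1", "NAND", ("A", "B")),
    ("N2", "NAND", ("B", "C")),
    ("Z", "NAND", ("N1", "N2")),
]
INPUTS, OUTPUT = ("A", "B", "C"), "Z"

# Fault list: every gate output stuck-at-0 and stuck-at-1.
FAULTS = [(g[0], v) for g in GATES for v in (0, 1)]


def simulate(vector, fault=None):
    """Return the circuit output; `fault` is (signal, stuck_value) or None."""
    sig = dict(zip(INPUTS, vector))
    for out, _kind, ins in GATES:
        sig[out] = 0 if all(sig[i] for i in ins) else 1   # NAND
        if fault and fault[0] == out:
            sig[out] = fault[1]                           # force the stuck value
    return sig[OUTPUT]


def detected(tests):
    """Faults for which some test gives an output unlike the true value."""
    return {f for f in FAULTS for t in tests if simulate(t, f) != simulate(t)}


def generate_until(target):
    """Add tests until the detected fraction of faults reaches `target`."""
    tests, candidates = [], product((0, 1), repeat=len(INPUTS))
    while len(detected(tests)) / len(FAULTS) < target:
        vector = next(candidates, None)   # stand-in for an ATG strategy
        if vector is None:
            break                         # strategy exhausted
        if detected(tests + [vector]) - detected(tests):
            tests.append(vector)          # keep only tests that add coverage
    return tests, len(detected(tests)) / len(FAULTS)


tests, coverage = generate_until(1.0)
print(tests, coverage)
```

Lowering the target, or swapping the candidate source for a different strategy, corresponds to the choices the text describes: the user re-enters the loop with new settings whenever the evaluated detection level falls short.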


3.3.2 Controllability, Observability, and Maintenance Engineering Techniques 
(COMET) 

Past experience has indicated that the effectiveness of diagnostic 
testing depends not merely on the techniques used in deriving tests and 
test results, but also on the inherent structural diagnosability of the 
unit.* The ATG system is a tool for aiding the derivation of test vec- 
tors for given circuits. The COMET system, on the other hand, em- 
ploys a technique that enables one to determine for a given circuit the 
optimal placement of control-access and monitor points for diagnostic 
testing. 

The COMET analysis is initiated by entering the connectivity of the 
functional blocks of a unit via LSL-LOCAL (see Fig. 1). The control 
and observation relations among the various functional nodes are 
automatically created from the connectivity (or simulator) tables. 
Through the use of the command-language interpreter, the user can 
then direct COMET to analyze, to examine, and to modify the topolog- 
ical structure of the unit. The modification of the structure for addi- 
tional control and/or observation is performed automatically, or it 
can be explicitly directed by the user. Once the design has been 
COMETized, it enjoys the following advantages: 


(i) Trouble-location-manual data can be generated and updated 
without the use of fault simulation. 
(ii) Multiple faults and all nonclassical faults are locatable if they 
are detectable. 
(iii) Diagnostic information can be easily updated in accordance 
with hardware change(s). 
(iv) An orderly approach to the implementation of an overall diag- 
nostic design is provided. 
(v) The fault-location procedure is substantially simplified. 


3.3.3 Logic simulators 


At the heart of the LAMP system are the logic simulators. These are 
the programs that actually perform the simulation of the circuit under 
test. A total of six simulators is available, each of which is designed to 
suit a particular condition.* Under the control of the command-lan- 
guage interpreter, one or more of the simulators can be called to 
simulate a particular circuit. The six simulators available in the LAMP 
system are: 


(i) True-value simulator: This simulator simulates only the true- 
value (or nonfaulted) conditions of the circuit. Simulation is 
done at the gate level. 

(ii) Fault simulator: This simulator can simulate the action of 
classical stuck-at-type faults (input open, output stuck-at-zero, 
and stuck-at-one) in addition to the true value. This enables one 
to study the behavior of faulty circuits, to evaluate the fault- 
detection capability of maintenance-check circuits and tests, 
and to generate diagnostic data for trouble-location-manual 
production. 

(iii) Timing simulator: This simulator allows the specification of 
individual rise and fall times of all gates in the circuit but does 
not simulate the effect of the stuck-at faults. It is designed pri- 
marily for detailed timing analysis to verify that the circuit 
will work under worst-case conditions. 

(iv) Parallel simulator: The features of this simulator are similar 
to the ones available in the fault simulator. The major differ- 
ence is that the parallel simulator employs a technique whereby 
the true value and a small set of faults are simulated con- 
currently. 

(v) Shorted-fault simulator: This simulator allows for simulation 
of nonclassical faults such as crossover shorts and shorts be- 
tween adjacent paths. It is useful in aiding the design of manu- 
facturing tests for circuit pack check-out. 

(vi) Functional simulator: This simulator allows one to simulate 
the circuit behavior at a higher level (e.g., registers, memories, 
etc.) than at the gate level. Functional simulation is most useful 
in verifying initial logic design where detailed knowledge of 
gate-level logic is not available or the function(s) cannot be 
conveniently modeled by gate-level techniques. 


The cost effectiveness of the LAMP system depends on the user's 
choosing the correct simulator or simulators for use in his application. 
Consequently, it was found necessary to combine the results of more 
than one simulator if the model of one simulator is not sufficient for a 
particular application. This is accomplished by the output system. 


* (Section 3.3.2) Depending on the level of integration and the purpose at hand, 
a unit can be interpreted as a processor, a functional module, a circuit pack, or an 
LSI chip. 

* (Section 3.3.3) Designing each simulator for a particular condition was found 
desirable and cost effective, especially in a production environment where system 
performance and accuracy are often weighed against each other in the search for an 
optimum mix. 
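The idea behind the parallel simulator, simulating the true value and a set of faults concurrently, is commonly realized by packing one machine per bit of a word. The Python sketch below shows that classic bit-parallel technique on an invented three-gate circuit; it is an assumption that LAMP's parallel simulator resembles this, and the circuit, fault list, and function name are illustrative only.

```python
GATES = [                      # (output, inputs), all NAND, evaluation order
    ("N1", ("A", "B")),
    ("N2", ("B", "C")),
    ("Z", ("N1", "N2")),
]
FAULTS = [("N1", 0), ("N1", 1), ("N2", 0), ("N2", 1), ("Z", 0), ("Z", 1)]
WIDTH = len(FAULTS) + 1        # bit 0 is the true-value machine
ALL = (1 << WIDTH) - 1


def parallel_simulate(vector):
    """Simulate the good circuit and every fault in one pass.

    Bit k (k >= 1) of each signal word carries the value seen by the
    machine containing FAULTS[k - 1]. Returns the faults whose output
    differs from the true value.
    """
    # Broadcast each input value to all machines.
    sig = {name: (ALL if bit else 0) for name, bit in zip("ABC", vector)}
    for out, ins in GATES:
        word = ALL
        for i in ins:
            word &= sig[i]
        word = ~word & ALL                     # NAND across all machines
        for k, (site, stuck) in enumerate(FAULTS, start=1):
            if site == out:                    # inject fault k in machine k only
                word = word | (1 << k) if stuck else word & ~(1 << k)
        sig[out] = word
    z = sig["Z"]
    good = ALL if z & 1 else 0                 # replicate the true value
    diff = z ^ good
    return [FAULTS[k - 1] for k in range(1, WIDTH) if diff >> k & 1]


print(parallel_simulate((0, 0, 0)))
print(parallel_simulate((1, 1, 0)))
```

One pass over the gates serves all machines at once, so the cost per fault drops roughly with the word width; the price is that every machine must follow the same evaluation schedule.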


3.4 Output system 


In LAMP, a versatile output system is available that enables users 
to collect simulation and analysis results in one of several different 
formats (or in user-generated formats). Outputs may be specified at 
any time during or after the run. The results of several simulation runs, 
possibly from different simulators, may be combined after the simulations 
have taken place to produce the desired output. All these options can be 
specified by the command language. 

Among the various output options available, some of the most com- 
monly used will be described here. To verify the validity of the logic 
design, the VALUES option is often used; it lists the inputs and 
outputs along with the values (1, 0, and "don't know") of the outputs for 
a given input test vector. In cases where one is interested in the 
internal states of the circuit, one can use the GATEIO option to display 
the values of selected gates and their driving and driven gates. This 
feature is especially useful during a simulation run when the run is 
temporarily halted or has gone into oscillation; another specific use of 
this feature is to display circuit connectivity. Another format often 
used to display the outputs of timing and the true-value simulators is 
TLGRAPH. TLGRAPH is an oscilloscope-like trace of the signals on 
the output gates, from the time the test is applied until the time the 
circuit settles down. Whenever the value of an output gate changes, 
the time interval is recorded as well as the output gate values. This 
format has proven to be extremely valuable in studying worst-case 
timing conditions. 

A variety of output formats is also available for studying the com- 
pleteness, accuracy, and resolution of diagnostic tests. The ATP format 
lists all the faults that have not been detected by the test sequence 
simulated. The RAW output format lists each output gate's name and 
true value, the number of faults that cause that true value to be 
complemented, and a listing of these faults. For a large 
run where a user is interested in only a summary of the run, the 
MATRIX output can be used to show the faults detected by each test; 
the result is presented in the form of a matrix or a fault table. If the 
user is interested in fault partitioning and diagnosability information, 
he can choose the TREE output that lists the test results in the form 
of a diagnostic tree by grouping all those faults causing the circuit to 
behave in the same manner for a particular test sequence. 
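The difference between a fault table and a fault partition can be shown with a small example. The Python sketch below is an illustration only: the response data are hypothetical, and the two functions approximate the kind of information the MATRIX and TREE outputs convey rather than reproducing their formats.

```python
from collections import defaultdict

# Hypothetical fault-simulation results: for each fault, the outcome of
# each of three tests (1 = the test fails, i.e., the fault is detected).
RESPONSES = {
    "N1 stuck-at-0": (1, 0, 0),
    "N2 stuck-at-0": (1, 0, 0),
    "Z stuck-at-1": (1, 0, 0),
    "N2 stuck-at-1": (0, 1, 0),
    "Z stuck-at-0": (0, 1, 1),
    "N1 stuck-at-1": (0, 0, 1),
}


def fault_matrix(responses):
    """MATRIX-like view: one row per test listing the faults it detects."""
    n_tests = len(next(iter(responses.values())))
    return [sorted(f for f, r in responses.items() if r[t])
            for t in range(n_tests)]


def fault_classes(responses):
    """TREE-like view: group faults with identical responses to every test."""
    groups = defaultdict(list)
    for fault, signature in responses.items():
        groups[signature].append(fault)
    return dict(groups)


for t, row in enumerate(fault_matrix(RESPONSES), start=1):
    print("test", t, "detects", row)
for signature, faults in fault_classes(RESPONSES).items():
    print(signature, faults)
```

Faults that share a signature cannot be told apart by this test sequence, so the size of the largest group is a direct measure of diagnostic resolution.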

Facilities are also provided to allow the user to write his own output 
processing program. The raw output data set (RAWDS) contains all 
the raw data on the output gates from a simulation, including informa- 
tion such as the input vector on each test for which raw data are col- 
lected, names of inputs and outputs, fault cross-referencing informa- 
tion, fault and nonfault information, and certain circuit statistics. 
The user can manipulate this information to create the desired output 
format. The availability of this feature has substantially reduced the 
burden that otherwise would be imposed on the LAMP system de- 
velopers to meet the wide variety of user needs. 


IV. THE ROLE OF LAMP IN THE DESIGN PROCESS 


The process by which a logic design becomes a completed product 
has become very complex with the advent of integrated-circuit tech- 
nology. This process is made even more difficult in the telephone in- 
dustry because of the stringent up-time requirement of the switching 
machines." The ability to diagnose any equipment failure thus be- 
comes an important consideration in the design and implementation 
of these machines. 

The design and implementation process for a new switching system 
processor is made feasible by the constant use of computer-aided- 
design tools. Figure 4 shows the overall implementation process from 
the initial logic designs through to the completed processor. It also 
illustrates how the various major features of the LAMP system can be 
used in each design step. 

The start of any major logic design project is the specification of the 
system architecture along with the basic design decisions. The COMET 
feature of LAMP helps this process by providing information about 
the diagnosability of a proposed design. With this tool, the global 
diagnosability of a system design can be established. Once this overall 
design step has been completed, the logic can be partitioned into indi- 
vidual circuit packs and detailed circuit designs can begin. In this 
phase of the design, the designer uses the true-value simulator for 
design verification, and frequently uses the timing simulator to make 
sure that the logic-timing functions are correct. 

The use of these simulators requires that the logic circuit be encoded 
in the LSL-LOCAL language. The encoding of the circuit in the LSL- 
LOCAL language at this point accomplishes two basic functions. The 
first function is to catch any circuit discrepancies through the use of 
audits in the language processor and the second is to provide a machine- 
readable form of the circuit design. This latter function is basic to the 
entire computer-aided-design function. 

Fig. 4. Diagram of LAMP system use in logic-circuit design. 


In addition to the basic circuit information, it is possible to input 
physical design information through the LSL-LOCAL language. When 
the designer is satisfied with the design of the circuit on a circuit-pack 
basis, the verified logic is then used as a base for the physical design 
process. Here the various additional machine-aided tools are used to 
perform partitioning, placement, and routing. The successful com- 
pletion of physical design thus establishes a logical and physical design 
data base from which other uses of LAMP in the design process may 
take place. Some examples of these activities are: (i) derivation of 
circuit-pack diagnostic tests for manufacturing check-out purposes; 
(ii) design verification of the subsystems (which are formed by com- 
bining circuit packs) and the complete processor (which is formed by 
combining the subsystems); and (iii) design and verification of diag- 
nostic program(s) and generation of TLM data. 


V. EXAMPLES OF LAMP SYSTEM USE 


To provide some insight into the use of the LAMP system, a few 
examples of simple procedures performed with the LAMP system are 
presented. Because of the large number of ways the LAMP system is 
utilized, it is impossible to cover more than a small area of the system 
functions. The examples shown, however, are representative of typical 
activity. 

All user communication with the LAMP system is by use of a com- 
mand language. Each command represents an action to be taken by 
the system. In conversational use, the system prompts for the next 
input by means of a > character. Some commands which require addi- 
tional information prompt the user with an @ character. 


Example 1—Logic Verification Run 
(TSS Log-on Procedure) 


System: LAMP DESIGN AUTOMATION SYSTEM 
ENTER COMMANDS 
> 
User: load expl.tables 
System: CKTNAME: EXAMPLE.CIRCUIT VERSION 06/24/73 
> 
User: run tval expl.test.vectors,expl.output.results,p 
System: LAMP TVAL SIMULATOR—VERSION 2.5 
> 
User: display values,t 
System: AT INPUT NO. = 3 
INPUTS: SA SB CA CB 
SEN CEN 
OUTPUTS: SOUT COUT 
INPUTS: 100001 
OUTPUTS: 11 
> 
User: end 


In this example, the user desires to test the "good" operation of his 
logic design by exciting his circuit with a series of prestored input 
vectors. The circuit description has been previously compiled from an 
LSL-LOCAL encoding into a data set called "expl.tables." The pre- 
stored input vectors are located in the data set "expl.test.vectors." 
Since he is not interested in fault analysis, the TVAL (true-value) 
simulator is chosen. For nonfaulted operation, this simulator is the 
most efficient of the six available. The results he needs for his analysis 
can be obtained in two ways. The bulk of the output is produced via 
the computer high-speed printer. The particular types of results the 
user wants are specified by the contents of data set 
"expl.output.results." The "p" indicates that the results are to be printed as soon as 
possible. Because the user wants a quick check of some of the results 
before the other output is available, he instructs the system to display 
the input and output gate names along with their associated output 
values on the terminal after the simulation is completed. Satisfied with 
the results, he ends the simulation. 


Example 2—Creation of the Controlling Data Sets 
(TSS Log-on Procedure) 


System: LAMP DESIGN AUTOMATION SYSTEM 
ENTER COMMANDS 
> 
User: source lsllocal expl.source,expl.tables 
System: LOCAL LP START 
LOCAL LANGUAGE PROCESSOR—VERSION 3 
LOCAL LP END 
> 
User: results expl.output.results 
System: ENTER SIMULATION RESULTS SPECIFICATIONS 


@ 
User: after input «every; values 
System: @ 
User: [default] 
System: > 


User: invec expl.test.vectors 


System: ENTER INPUT VECTORS 


@ 
User: 1'101031' 
System: @ 
User: +'100001' 
System: @ 
User: [default] 
System: > 
User: end 

In this example, the user creates the data sets used to control the 
simulation run shown in Example 1. The first action is to compile the 
logic-circuit description written in LSL-LOCAL that has been stored in 
data set "expl.source" in a form that the compiler can use. The 
compiled information is stored in data set "expl.tables." Next, the 
data set ("expl.output.results") that controls the output results is 
formed by use of the RESULTS command. The information put into this 
data set will instruct the simulator to print the values of the inputs 
and outputs after every input vector has excited the circuit. 

Finally, the series of input vectors used to excite the circuit is 
created by use of the INVEC command. In this case, a series of these 
input vectors has been created. The input value "3" signifies a 
"don't care" value. 

Only a few of the available commands and options have been shown. 
However, these should provide an idea of the ways in which the system 
can be used. Additional examples will be presented in the other papers 
of this series to illustrate specific points under discussion. 


VI. CONCLUSION 


Present and future designs of digital systems require computer aids 
during all phases of development, from initial architecture specifica- 
tions to diagnostic-test design. The efficiency of these tools in per- 
forming their intended functions is of great importance, from both 
internal (efficiency of algorithms) and external (user convenience and 
usefulness) considerations. Viewed in this light, the LAMP system has 
been an outstanding success. The use of LAMP has been found to be 
cost effective in that LAMP provides the designers a convenient facility 
to assure design quality, to expedite error correction, and to reduce 
design-rework cost. LAMP also offers the designer a versatile tool to 
evaluate and verify the system diagnostics before hardware is com- 
mitted. It has become an integral part of the design of new electronic 
switching systems and has strongly influenced the methodology of their 
design. 

The other papers in the series will give more detailed descriptions of 
the use and design of selected portions of the LAMP system. 


Logic-Circuit Simulators 


By S. G. CHAPPELL, C. H. ELMENDORF, and L. D. SCHMIDT 
(Manuscript received February 20, 1974) 


The algorithms used for logic-circuit simulation in the Logic Analyzer 
for Maintenance Planning (LAMP) system are described. Several 
simulators are available to allow a cost-effective tradeoff between 
simulation cost and the level of detail needed for a particular 
application. The true-value simulator provides efficient simulation of 
fault-free logic circuits. Two fault simulators simulate the classical 
stuck-at faults as well as shorted-gate-output faults. Hyperactive 
faults, those faults which cause an inordinate amount of simulation 
activity, are discussed along with their impact on simulation time. A 
four-value simulation logic is described which simplifies circuit 
initialization procedures. 


I. INTRODUCTION 


The use of digital simulation of logic circuits has been widely 
accepted in the computer and telephone industries to verify logic- 
circuit designs, to analyze the behavior of logic circuits in the presence 
of faults (such as gate outputs permanently stuck at logical 0 or 
logical 1, open gate inputs and shorted gate outputs), and to aid the 
generation of fault-detection tests for logic circuits. 

Most simulators described in the literature can be divided into three 
categories. The first category includes the true-value simulators that 
simulate the circuit in the absence of any faults or, by altering the 
circuit description, simulate the circuit in the presence of one perma- 
nent fault.! The second category includes the parallel simulators that 
concurrently simulate the fault-free circuit and the effect on the circuit 
of a small set of single permanent faults.2-* The third category in- 
cludes the deductive simulators that concurrently simulate the fault- 
free circuit and the effect on the circuit of all single permanent faults.® 
The Logic Analyzer for Maintenance Planning (LAMP) system con- 
tains simulators from each category. 

The LAMP system has been extensively used over the last four 
years to simulate the No. 1A and No. 4 Electronic Switching Systems 
to verify the logic design, to aid the generation of diagnostic tests, and 
to analyze the behavior of the circuits in the presence of faults. Circuits 
containing 52,000 gates and 23,000 single faults have been simulated 
using the IBM 370 Model 168 as the host machine. 

The simulators in the LAMP system provide a complete range of 
capabilities for the design of logic circuits. Circuits and subsets of 
circuits can be simulated at the gate level (NAND, AND, OR, NOR, 
NOT), at the functional level (register, memory, etc.) or at the hybrid 
level (a combination of gates and functions). At the gate level, gates 
can be modeled in sufficient detail to account for variations of such 
parameters as temperature and wiring capacitance. Several different 
classes of faults can be considered including gate outputs stuck at 
logical 0 or logical 1, gate inputs open, and shorted gate outputs. 
Facilities have been provided to help the user debug his logic design 
and his diagnostic tests.° 

This paper presents a description of the LAMP simulators. In the 
first section, the family of simulators is described, including an 
example of their use in the design of a logic circuit. This is followed by 
a description of the common attributes of the LAMP simulators. In 
the second section, the basic LAMP simulator for fault-free circuits 
is described and is the basis for describing the other LAMP simulators. 
In the next sections, descriptions of the deductive fault simulators 
and functional simulators are presented. In the seventh section, the 
detection and elimination of a class of "hyperactive" faults is 
described. Finally, data on the performance of the various simulators 
are presented. 


II. THE SIMULATOR FAMILY 


This section describes the use of the various LAMP simulators 
during the design of a logic circuit. This is followed by a description 
of the common attributes of LAMP simulators. 

As the level of logic-circuit integration increases, it becomes more 
difficult to build “breadboard” models. This often means that more 
emphasis must be placed on the results of logic-circuit simulation. 
Therefore, it is desirable to use an extremely accurate simulation model 
of the logic circuit. Unfortunately, as the accuracy (level of detail) 
of the model increases, so does the cost of simulation. Since LAMP 
was designed for large circuits (up to 65,000 gates), cost is an im- 
portant parameter. One way to partially circumvent this problem is 
to utilize several different simulators, each of which provides a detailed 
model especially tailored to optimize the simulation of a physical 
circuit. 


2.1 Use of simulation during circuit design 


Consider the design of a small processor. Given the overall specifica- 
tions for the processor, the designer can create a functional level model 
of the circuit where the building blocks include registers, memories, 
decoders and an arithmetic unit. Using the functional simulator, the 
design can be simulated at the functional level to verify the operation 
and timing of the processor. The processor can now be divided into 
functional units for detailed logic design of each unit. The functional 
units may be further divided into circuit packs containing a few 
hundred gates each. 

The detailed logic design of the circuit pack is performed and is 
verified using the LAMP true-value simulator. The true-value simulator 
simulates only the fault-free circuit by modeling the logic gates as 
logic elements followed by pure time delays. This is a fast, economical 
simulator. 

If the timing of the signals on the circuit pack is critical, the designer 
may wish to perform a more detailed timing analysis of his circuit 
using the LAMP timing simulator. The timing simulator’ allows each 
gate to be assigned minimum and maximum time delays for the rising 
and falling signal transitions. The gate output is treated as unknown 
during the time between the minimum and maximum transition 
delays. This provides a more detailed analysis of circuit behavior in 
the presence of variations in gate time delays resulting from such 
factors as temperature change, gate loading, and capacitance. In 
addition, gate input pulses of shorter duration than the minimum 
transition delay are ignored and, therefore, do not affect the gate 
output value. 
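
The minimum/maximum delay model described above can be sketched as follows. This is an illustration only, not LAMP's implementation: the function names, the `(time, value)` event tuples, and the use of value 3 for the unknown interval are assumptions made for the example. Separate rising and falling delays could be handled by choosing `d_min` and `d_max` according to the direction of the transition.

```python
UNKNOWN = 3  # assumed encoding of the "unknown" output value


def schedule_transition(t_in, new_value, d_min, d_max):
    """Return (time, value) output events for an input change at t_in.

    The output is unknown from t_in + d_min until t_in + d_max, when it
    settles to new_value. With d_min == d_max there is no unknown interval.
    """
    if d_min > d_max:
        raise ValueError("d_min must not exceed d_max")
    events = []
    if d_min < d_max:
        events.append((t_in + d_min, UNKNOWN))
    events.append((t_in + d_max, new_value))
    return events


def filter_short_pulses(pulses, d_min):
    """Drop input pulses (start, end) shorter than the minimum delay."""
    return [(start, end) for (start, end) in pulses if end - start >= d_min]
```

For example, a change at time 10 with delays of 2 and 5 yields an unknown output at time 12 and the new value at time 15.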

Once the designer has verified that his logic-circuit design meets 
the operational specifications, he must generate manufacturing test 
vectors (circuit input stimuli) to verify the integrity of the newly 
manufactured circuit pack. Whether the designer creates the test 
vectors by hand or uses the automatic-test-generation system,* he may 
use the LAMP fault simulator to evaluate the quality of the resulting 
set of input test vectors. The fault simulator simulates the effect on a 
logic circuit of the presence of all single classical (gate input open, 
gate output stuck-at-zero, and gate output stuck-at-one) faults. This 
is a deductive simulator’ that associates with each gate output a 
fault list containing those faults that will complement the correct 
(true) logic value (logical 0 or 1) of that gate. The fault lists may 
contain any number of faults, which theoretically allows the simul- 
taneous simulation of all classical faults. Because of the effort required 
to process the fault information, the fault simulator is considerably 
more expensive to use than the true-value simulator. Through the 
use of the fault simulator, tests can be designed, or the circuit can be 
modified, to attain the desired level of fault detection. 

If the number of faults to be simulated is less than a few thousand, 
it may be more economical to use the LAMP parallel simulator instead 
of the fault simulator. The parallel simulator uses parallel fault-simula- 
tion techniques?‘ to simulate up to 2048 single classical faults in one 
pass. A variable-width-fault word is utilized so that simulation time 
and storage are minimized. The relative merits of the parallel and 
deductive fault simulation techniques are presented in Ref. 9. 
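
The general idea of parallel fault simulation can be illustrated with a short sketch. It assumes the common textbook arrangement in which bit 0 of each signal word carries the fault-free value and every other bit position carries the value of the same signal in one faulty copy of the circuit. The function names and the word layout are assumptions for this example and are not taken from LAMP.

```python
def parallel_nand(a_word, b_word, width):
    """Bitwise NAND over fault words; bit 0 is the fault-free circuit."""
    mask = (1 << width) - 1
    return ~(a_word & b_word) & mask


def inject_stuck_at(word, bit, stuck_value):
    """Force the copy of the signal in position `bit` to stuck_value."""
    if stuck_value:
        return word | (1 << bit)
    return word & ~(1 << bit)


def detected(word, width):
    """Bit positions (faults) whose value differs from fault-free bit 0."""
    good = word & 1
    return [k for k in range(1, width) if ((word >> k) & 1) != good]


# Two-input NAND with both inputs at logical 1 in every circuit copy.
# Bit 1 models the output stuck at 0, bit 2 the output stuck at 1.
out = parallel_nand(0b111, 0b111, 3)
out = inject_stuck_at(out, 1, 0)
out = inject_stuck_at(out, 2, 1)
visible_faults = detected(out, 3)
```

Because one machine-word operation evaluates the gate for all copies at once, widening the word lets more faults share a single pass.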

After the chip layout and printed-wire routing for the circuit pack 
are complete, the designer may choose to examine the effectiveness of 
his classical fault tests against possible shorted faults using the LAMP 
shorted-fault simulator, which simulates the effect on a logic circuit of 
the presence of single pairs of gate outputs shorted together. If two 
gate outputs, A and B, are shorted together where gate A has the 
value logical 1 and gate B has the value logical 0, it is assumed that 
the logical 0 will dominate and the output of gate A will be forced to 
logical 0. A user option is available which forces logical 1 to dominate 
logical 0. Potential shorted faults that may be simulated include 
shorted adjacent pins on chips, shorted adjacent paths on the printed 
wiring board, and shorted crossover points on the printed wiring 
board. These data are obtained from the manufacturing information for 
each circuit. The shorted-fault simulator uses the deductive simulation 
technique. 
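
The dominance rule for a shorted pair can be written as a small function. This is a minimal sketch restricted to binary values, with a hypothetical function name and a `dominant` parameter standing in for the user option mentioned above.

```python
def shorted_pair(a, b, dominant=0):
    """Resolve two shorted binary gate outputs; the dominant value wins."""
    if a not in (0, 1) or b not in (0, 1) or dominant not in (0, 1):
        raise ValueError("binary values only in this sketch")
    if a == b:
        return a, a
    return dominant, dominant
```

With the default, `shorted_pair(1, 0)` forces both outputs to logical 0, and `shorted_pair(1, 0, dominant=1)` forces both to logical 1.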

After the circuit packs are designed, the designer can link all the 
circuit packs together to form the complete processor and perform 
the same logic verification process on the larger circuit with a few 
minor differences. The true-value and timing simulators are used both 
to verify the logic design of the processor and to verify the diagnostic 
program for the processor. The various fault simulators are used to 
evaluate the effectiveness of the diagnostic throughout the design- 
change cycle until the design is complete. 


2.2 Common simulator attributes 


The common attributes of the LAMP true-value, fault, timing, 
shorted-fault, parallel, and functional simulators are described below. 

(i) The version of LAMP that is described is implemented on 
the IBM 360 Model 67 and IBM 370 Model 168 under the 
IBM interactive, virtual-memory operating system TSS. A 
version of LAMP is also available under the IBM operating 
system OS. 

(ii) The first version of LAMP (1969) contained only the fault 
simulator. New simulators have been implemented as needed, 
and existing simulators have been improved to produce the 
complete system for logic simulation now available in LAMP. 

(iii) The simulators can be accessed from an interactive terminal 
or used in the batch mode via card input or prestored data. 
Interactive features include the ability to temporarily stop 
the simulation when any specified gate changes value and the 
ability to correct, from the terminal, errors in the circuit design 
or input data. 

(iv) Logic circuits are simulated at the gate level (NAND, AND, 
NOR, OR, and NOT) except in the functional simulator, which 
also accepts descriptions of higher-level blocks such as 
memories and registers. 

(v) Four simulation values (0, 1, 2, and 3) are used to simulate 
binary-logic circuits. The simulation values 0 and 1 represent 
the logic values 0 and 1. Values 2 and 3 represent unknown 
conditions in the logic circuit. This is explained in more 
detail in Section 3.2. 

(vi) Conditions that cause the output values of flip-flops to be 
unpredictable are detected and the flip-flop outputs are forced 
to the unknown state 3 by a process called race analysis. 
Possible circuit oscillations are detected by a process called 
oscillation analysis. Both procedures will be described in more 
detail in Sections 3.3 and 3.4. 

(vii) LAMP uses discrete event simulation where all activity occurs 
at integral multiples of the basic increment of simulation time. 
The basic increment definition is arbitrary and may represent 
such units as nanoseconds, microseconds, or gate delays. 
Lists, called timing lists, are maintained by each simulator 
such that one timing list is associated with each increment of 
simulation time. Each timing list contains a list of gate-output 
changes scheduled to occur at that increment of 
simulation time. The timing list associated with the current 
increment of simulation time is called the current timing list. 

(viii) Selective trace is used so that a gate output is computed only 
if any of the gate's input signals changed value. 

(ix) The circuit description is contained in a set of two-way, 
linked-list tables, which include information about each gate 
such as the driving and driven gates, logic function, time 
delay, and faults to be simulated. A subroutine, associated 
with each logic function, examines the gate-input values, 
computes the new output values, determines whether the 
output values have changed, and schedules the output change 
(if any) into some future timing list. 


III. THE TRUE-VALUE SIMULATOR 


The operation of the true-value simulator will be used as the basis 
for the presentation of the fault simulators. A simplified flow chart of 
the operation of the true-value simulator is shown in Fig. 1. This 
flow chart is also used in Section 3.5 to describe the overall simulator 
operation. 

Fig. 1—Simplified simulation flow. 


3.1 The true-value calculation 


The LAMP simulators use four logic values, 0, 1, 2, and 3, to simulate 
the Boolean logic functions. The 0 and 1 are simply the logical 0 and 
1 of Boolean algebra. Values 2 and 3 represent nonpropagating and 
propagating "don't-know" conditions, respectively. The gate output 
calculation occurs in Step 5 of Fig. 1. 

Value 2 is used to allow efficient initialization of the circuit. Prior 
to a simulation run, all gates are initially assigned a value of 2. Its 
nonpropagating feature is demonstrated by the following table of a 
two-input NAND gate: 


    A    B    $\overline{A \cdot B}$ 
    2    0    1 
    2    1    Q 
    2    2    Q 
    2    3    Q 


where Q means no change in the previous true value. 

The nonpropagation is necessary to prevent destroying information 
specified by setting a priori the state of the circuit. For example, in 
Fig. 2, if the state specification sets C = 0, nonpropagation is necessary 
to prevent the true value of C from being overwritten by a don't 
know. Value 2 allows C = 0 to initialize the flip-flop to C = 0 and 
D = 1. A more detailed explanation of the behavior of 2s will be 
presented in the next section. 

True-value 3 is a true "don't know" with full propagation features. 
The truth table for a two-input NAND gate is shown below: 


    A    B    $\overline{A \cdot B}$ 
    3    0    1 
    3    1    3 
    3    2    Q 
    3    3    3 


where Q means no change in the present true value. 

In Fig. 2, if all 2s were replaced by 3s, then the output of C and D 
would become 3 even though the user initialized C to logical 0. 

Fig. 2—NAND flip-flop. 
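
The two truth tables can be combined into one small evaluation rule, sketched below. This is an illustration under stated assumptions rather than LAMP code: the function names are hypothetical, the rule is taken to be symmetric in the two inputs, and the flip-flop of Fig. 2 is assumed to be wired as C = NAND(A, D) and D = NAND(B, C). The gates are also evaluated one after the other here, which is simpler than the simultaneous assignment used by the simulator.

```python
def nand4(inputs, previous):
    """Four-valued NAND: 0/1 logic, 2 nonpropagating, 3 propagating unknown.

    `previous` is the gate's present output, returned for the "Q" entries.
    """
    if 0 in inputs:
        return 1          # a logical 0 input forces the output to 1
    if 2 in inputs:
        return previous   # Q: value 2 does not propagate
    if 3 in inputs:
        return 3          # value 3 propagates as an unknown
    return 0              # all inputs are logical 1


def settle_flip_flop(a, b, c, d, passes=4):
    """Repeatedly evaluate C = NAND(A, D) and D = NAND(B, C)."""
    for _ in range(passes):
        d = nand4((b, c), d)
        c = nand4((a, d), c)
    return c, d


# With 2s on the inputs, the user-set C = 0 survives and forces D = 1.
kept = settle_flip_flop(a=2, b=2, c=0, d=2)
# With 3s instead, the unknowns propagate and overwrite the user's C = 0.
lost = settle_flip_flop(a=3, b=3, c=0, d=3)
```

Under these assumptions, `kept` is `(0, 1)` and `lost` is `(3, 3)`, matching the behavior described for Fig. 2.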


3.2 Initial-state analysis 


The purposes of the initial-state analysis (Step 1 in Fig. 1) are: 
(i) to extract as much information as possible from the user-specified 
circuit state (if any), and (ii) to guarantee that the output of each 
gate is consistent with its inputs. A flow chart of this procedure is 
shown in Fig. 3. 

True-value 2 is used only during the initial-state analysis which 
occurs before the first input vector is applied to the circuit. The 
initial-state analysis is a three-pass procedure that attempts to propa- 
gate the effect of any user-specified state through the circuit. Pass 1 
has two alternatives. If the user did not set any state, then pass 1 
simply changes all of the gates whose output value is 2 to the "true" 
unknown-value 3 and the simulation of the input vectors begins. 

However, if the user has set some initial state, then the initial-state 
analysis must propagate the effect of that state through the circuit. 
During pass 1, the circuit contains the logic-value 2 for the 
"don't-know" condition. The nonpropagation feature of the 2s allows as 
much information as possible to be extracted by the simulator using 
only a forward simulation. No attempt is made to set the inputs of 
some NAND gate to logic-value 1 if the output of the gate is logical 0. 
Thus, LAMP requires that initial states should always be set using 
the "dominant" value of the particular logic type used. For example, 
because gate C of Fig. 2 was set to a logical 0, pass 1 would set D to 
value 1. 

Fig. 3—Initial-state pass. 

Pass 2 goes through the circuit and changes selected remaining 
gates whose output value is 2 to new output-value 3. This is necessary 
because the 3s propagate where the 2s do not. Therefore, leaving the 
2s in the circuit can cause incorrect simulation results. However, it is 
only necessary to change to 3 those gates within maximally strongly 
connected subgroups (MSCs)! having output-value 2. This occurs 
because the circuit inputs are assumed to support any state which 
the user sets. Therefore, the input gates as well as any combinational 
circuitry driven by the inputs maintain true-value 2 until it is 
eliminated by the first input vector or the next pass. 

Pass 3 propagates the newly injected 3s as far as possible. This may 
have the effect of destroying some incomplete state which the user 
specified because the circuit is unable to support the incomplete state 
for all possible complete states. If a complete self-supporting (stable) 
state is specified, no state information will be eliminated. 

Initializing the circuit to some known value can introduce simulation 
inaccuracies during fault simulation. If the circuit is artificially 
initialized, there is no record of those faults whose presence would 
prevent the circuit from reaching the initial state specified. Therefore, 
it is preferable to apply a synchronizing sequence to the circuit to 
drive it from an unknown state (all gate output values set to 3) to 
some known state. The facility to artificially initialize the circuit is 
provided to help the user and to simplify his work." 


3.3 True-value race analysis 


Race analysis (Step 3 in Fig. 1) is performed on the basic NAND 
and NOR flip-flop. Previous simulation techniques attempted to treat 
the flip-flop as a "black box." However, the "black box" approach 
leads to inaccurate simulations or to unwieldy simulation algorithms. 
Therefore, the technique used in LAMP is to detect races as invalid 
conditions on a set of gates. Since both NAND and NOR flip-flops 
are handled in a similar manner, only the true-value race analysis for 
the NAND flip-flop will be discussed here. The basic NAND flip-flop is 
shown in Fig. 2. 

The true-value race state for a NAND flip-flop is A = 1, B = 1, 
C = 1, and D = 1 at the same time, t. From this state, it is impossible 
to predict (assuming identical gate behavior) whether C = 1 and 
D = 0 or C = 0 and D = 1 when the flip-flop settles. So as not to 
arbitrarily resolve races, true-value 3 is assigned to the output of both 
gates in the flip-flop. 

To accomplish this, when the flip-flop is in the A = 1, B = 1, 
C = 1, and D = 1 state at time t, the simulator calculates C = 0 and 
D = 0 for the new intermediate output to be scheduled into a future 
timing list. Since the state C = 0 and D = 0 is impossible unless the 
previous state was A = 1, B = 1, C = 1, and D = 1, both outputs at 
logical 0 provide an efficient race-detection mechanism.” Also, since 
C = 0 and D = 0 are unstable, both C and D will be scheduled to 
change values at the present increment of simulation time. Therefore, 
the outputs of a NAND flip-flop are set to true-value 3 and a race 
declared when: 


(i) The newly calculated, but not yet assigned, outputs of both 
gates are simultaneously 0. 

(ii) Both gate outputs are scheduled to be changed at the present 
time. 


If the NAND gates are cross-coupled, as shown in Fig. 2, but are 
not specified as a flip-flop, then race analysis will not be performed. In 
this case, if the flip-flop is in a race state, the new output C = 0 and 
D = 0 will be assigned to the gates in the flip-flop. The next output 
(assuming the inputs to the flip-flop do not change) will be C = 1 and 
D = 1 and the flip-flop will oscillate between C = 1, D = 1, and 
C = 0, D = 0 causing a simulator oscillation. 

In addition, because of the behavior of value 3, the condition where 
the newly calculated output values of the flip-flop are C = 0 and 
D = 3 or C = 3 and D = 0 will cause an oscillation. Therefore, this 
condition is also detected and declared to be a race. Extensive topo- 
logical circuit analysis could isolate the undeclared flip-flops, but such 
analysis is not performed since the circuit designers seldom fail to 
declare the race-pair gates. 
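
The race test for a declared NAND flip-flop can be sketched as a small predicate. The names below are hypothetical, and one point is an assumption: the text does not say whether the additional 0/3 patterns also require both gates to be scheduled at the present time, so this sketch applies the same scheduling condition to them.

```python
UNKNOWN = 3

# Newly calculated (C, D) output pairs that are treated as a race.
RACE_PATTERNS = {(0, 0), (0, 3), (3, 0)}


def resolve_race(new_c, new_d, c_due_now, d_due_now):
    """Return (C, D, race_declared) for a declared NAND flip-flop pair.

    A race is declared when both gates are scheduled to change at the
    present time and their newly calculated outputs match a race pattern.
    """
    if c_due_now and d_due_now and (new_c, new_d) in RACE_PATTERNS:
        return UNKNOWN, UNKNOWN, True
    return new_c, new_d, False
```

For instance, `resolve_race(0, 0, True, True)` returns `(3, 3, True)`, while the same outputs with only one gate due at the present time are passed through unchanged.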


3.4 True-value oscillation analysis 


A true-value oscillation (Step 3 in Fig. 1) occurs when the circuit 
state is unstable as a result of some input conditions. An oscillation is 
declared if the simulator simulates an arbitrary number, N, of incre- 
ments of simulation time and the circuit has not stabilized. The value 
of N is defaulted to be the number of gates in the logic circuit but can 
be adjusted by the user. 

If a true-value oscillation is detected, the old and new true values 
for every gate B whose output is changing at the present increment of 
simulated time are compared. If the old and new true values are 
different for gate B, the new true value is replaced with value 3 since 
the output of B is changing (i.e., unknown). Value 3 is the new gate 
output that will be scheduled in some future timing list. When 3s are 
inserted, the oscillation automatically stops, since a 3 represents both 
a 0 and a 1. 
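
The oscillation rule can be sketched as follows. The dictionary-based representation and the function names are assumptions for illustration; only the rule itself (replace a differing new value by 3 once N increments have passed without stabilization) comes from the text.

```python
UNKNOWN = 3


def oscillation_suspected(increments_simulated, limit):
    """True once `limit` increments have passed without stabilizing."""
    return increments_simulated >= limit


def break_oscillation(changing, old_values):
    """Replace differing new values by 3 for gates changing at this time.

    `changing` maps gate name -> newly calculated value and `old_values`
    maps gate name -> present value. Returns the values to schedule.
    """
    return {
        gate: (UNKNOWN if old_values[gate] != new else new)
        for gate, new in changing.items()
    }
```

A gate whose new value equals its old value is left alone, so only the gates that are actually toggling are forced to the unknown value.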


3.5 The true-value circuit model 


The true-value circuit model defines the interactions among the 
initial-state-analysis, gate-calculation, race-analysis, and oscillation- 
analysis steps that were presented earlier. Thus, a description of the 
true-value circuit model is an overall description of the simulator 
operation. 

A simplified flow chart of the basic simulator operation is shown in 
Fig. 1. The operation includes the following: 


Step 1—The circuit is analyzed to check the validity and consistency 
of any user-supplied initial state, as described in Section 3.2. 


Step 2—This step is repeated once for every circuit input vector to 
be simulated. During this step, the next input vector is obtained 
and the new input values are assigned to the circuit input leads. The 
effect of this input vector on the circuit is now propagated through 
the circuit. Every input gate whose value changed as a result of the 
new input vector is put into the appropriate future timing list. 
The future timing lists are examined, as the simulation time is 
incremented, until the first nonempty timing list is found. This 
timing list is called the current timing list. Let the present time be 
$t_0$ and assume that the set of gates $G = \{G_i,\ i = 1, 2, \ldots, m\}$ in the 
current timing list at $t_0$ contains all the gates whose outputs are 
changing at time $t_0$. Steps 3 through 6 are performed once for each 
timing list. 


Step 3—Race analysis is performed for each declared flip-flop 
formed by two gates, both of which are in G. 


Step 4—The new outputs are assigned to every gate in G. 


Step 5—After all the new outputs of G have been assigned, the 
output of each gate $H_j$, $j = 1, 2, \ldots, n$, which is driven by any 
$G_i$ whose output has changed, is calculated according to the gate model. 


Step 6—If the output of some $H_k$, $1 \le k \le n$, changed, then $H_k$ 
is put into the timing list of gates whose output may change at 
time $t_0 + t_k$, where $t_k$ is the transition time for $H_k$. If the output of 
$H_k$ did not change, no further action is taken on the gate. The 
important feature of this circuit model is that the gates $H_j$, 
$j = 1, 2, \ldots, n$, have their inputs calculated based on all new values of 
the gates in $\{G_i,\ i = 1, 2, \ldots, m\}$. That is, every change that is 
going to occur at $t_0$ occurs before the output of any gate driven by 
any of the gates in G is calculated. 


Step 7—Simulation may be allowed to continue or it may be inter- 
rupted to process a change on the input leads (Step 9) or to return 
to the command language to process user commands. 


Step 8—The simulation time is incremented. This makes the timing 
list at time $t + 1$ the current timing list and the loop continues. 
Simulation is terminated if there are no more input changes. 
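
The loop formed by Steps 2 through 8 can be sketched as a small event-driven simulator. This is a simplified illustration, not the LAMP implementation: it handles NAND gates only, omits race and oscillation analysis, requires every delay to be at least one time increment, and does not cancel an already scheduled change that later becomes stale. The data layout (`gates`, `values`, `timing`) and all names are assumptions made for the example.

```python
from collections import defaultdict


def nand4(inputs, previous):
    """Four-valued NAND following the tables of Section 3.1."""
    if 0 in inputs:
        return 1
    if 2 in inputs:
        return previous
    if 3 in inputs:
        return 3
    return 0


def simulate(gates, values, input_changes, max_time=1000):
    """Event-driven simulation of NAND gates with per-gate delays.

    gates: name -> (list of input names, delay >= 1).
    values: name -> present value (updated in place and returned).
    input_changes: primary-input name -> new value, applied at time 0.
    """
    fanout = defaultdict(list)
    for name, (ins, _delay) in gates.items():
        for src in ins:
            fanout[src].append(name)

    timing = defaultdict(dict)           # time -> {signal: new value}
    for name, new in input_changes.items():
        if values[name] != new:          # Step 2: schedule changed inputs
            timing[0][name] = new

    t = 0
    while timing and t <= max_time:
        current = timing.pop(t, None)    # current timing list, if any
        if current:
            # Step 4: assign every change before evaluating any fanout.
            changed = [g for g, v in current.items() if values[g] != v]
            for g in changed:
                values[g] = current[g]
            # Step 5: selective trace, evaluate only the driven gates.
            driven = {h for g in changed for h in fanout[g]}
            for h in sorted(driven):
                ins, delay = gates[h]
                new = nand4([values[i] for i in ins], values[h])
                # Step 6: schedule only real output changes.
                if new != values[h]:
                    timing[t + delay][h] = new
        t += 1                           # Step 8: advance simulation time
    return values


circuit = {"n1": (["a", "b"], 1), "n2": (["n1", "n1"], 2)}
state = {"a": 1, "b": 1, "n1": 0, "n2": 1}
final = simulate(circuit, state, {"a": 0})
```

Assigning all changes in the current timing list before evaluating any driven gate is the property emphasized in Step 6; the two separate loops inside `simulate` keep that ordering explicit.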


IV. THE FAULT SIMULATOR 


The fault simulator utilizes Armstrong’s® fault-list concept to allow 
concurrent simulation of all open gate input, output stuck-at-one 
(SA1), and output stuck-at-zero (SA0) faults in one pass per input 
vector. The input-open fault is assumed to force a nondominant value 
on that input. For example, for NAND and AND gates, the input 
open is assumed to force that gate input to logical 1. A number from 
0 to k — 1 is associated with each of the k faults in the circuit. Each 
gate G, except the inverter, is assigned N + 2 faults, where N is the 
number of inputs to gate G. The inverter has only the two output 
SA1 and SA0 faults, since the input-open fault is indistinguishable 
from the output SA0 fault. These fault numbers are then carried in 
fault lists associated with each gate. The hard faults, or corresponding 
fault numbers, in the fault list on gate G represent exactly those faults 
in the circuit that will cause the true value (logical 0 or 1) of gate G 
to be complemented. Only gates having 1 and 0 true values can have 
fault lists. Similarly, the star faults in the fault list on gate G represent 
faults in the circuit for which the value of G is not predictable by the 
simulation model. 
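
The fault-numbering rule (N + 2 faults for an N-input gate, two for an inverter) can be sketched as below. The tuple labels and the `gates` layout are hypothetical choices for this example.

```python
def enumerate_faults(gates):
    """Number the faults 0..k-1 for gates given as name -> (type, inputs)."""
    faults = []
    for name, (gate_type, inputs) in gates.items():
        if gate_type != "NOT":
            # One input-open fault per input lead (N faults).
            faults.extend((name, "open", pin) for pin in range(len(inputs)))
        # Two output faults for every gate, including the inverter.
        faults.append((name, "SA0", None))
        faults.append((name, "SA1", None))
    return dict(enumerate(faults))


table = enumerate_faults({
    "g1": ("NAND", ["a", "b", "c"]),   # 3 + 2 = 5 faults
    "g2": ("NOT", ["g1"]),             # 2 faults
})
```

Here `table` holds seven faults numbered 0 through 6, and those numbers are what a fault list would carry.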


4.1 Fault-simulator gate calculation 


The fault-simulator gate calculation (step 5 in Fig. 1) involves the 
manipulation of the fault lists on each gate using the fault-list algebra. 
In the description of the fault-list algebra, each fault list is treated as 
a set. The three set operations used for fault-list calculation are union 
($\cup$), intersection ($\cap$), and difference ($\ominus$). 

The union of two fault lists A and B is defined for some fault f to 
form the output-fault list F: 


Union Operation $A \cup B$ 

                      B 
    F           λ     f     *f 
           λ    λ     f     *f 
    A      f    f     f     f 
           *f   *f    f     *f 


where 

    *f = star fault corresponding to fault f, 
    λ = absence of f and *f from the set (fault list). 


The intersection of two fault lists, A and B, is defined for some fault 
f to form the output-fault list F: 


Intersection Operation $A \cap B$ 

                      B 
    F           λ     f     *f 
           λ    λ     λ     λ 
    A      f    λ     f     *f 
           *f   λ     *f    *f 


The difference of two fault lists A and B is defined for some fault f 
to form the output-fault list F: 


Difference Operation $A \ominus B$ 

                      B 
    F           λ     f     *f 
           λ    λ     λ     λ 
    A      f    f     λ     *f 
           *f   *f    λ     *f 

Fig. 4—Fault-list calculation. 


For an m input NAND gate G in Fig. 4, let: 


    $F_i$ = Fault list on the gate driving the ith input of G 
            ($1 \le i \le m$). 
    $f_i$ = Input open on the ith input of G. 
    $f_{SAk}$ = Output of G stuck at k (k = 1, 0). 
    $F$ = Resulting fault list on gate G. 
    $\bigcup^{k}$ means form the union over all the fault lists on input 
            leads whose true value is logical k, k = 0, 1. 
    $\bigcap^{j}$ means form the intersection over all the fault lists on 
            input leads whose true value is logical j, j = 0, 1. 


To calculate the new output fault list F from the input lists $F_i$, 
$1 \le i \le m$, consider the following cases. First, assume all m inputs are 
logical 1. The output true value is 0 and 


$$F = \Bigl\{\bigcup^{1} (F_i \ominus \{f_i\})\Bigr\} \ominus \{f_{SA0}\} \cup \{f_{SA1}\}. \qquad (1)$$


This equation means that the output SA1 fault plus any fault on 
any input, except the respective input-open faults and the output SA0 
fault, can cause the correct gate output to change values. 

Second, assume that all inputs are logical 0. Then the output value 
is 1 and 


$$F = \Bigl\{\bigcap^{0} (F_i \cup \{f_i\})\Bigr\} \cup \{f_{SA0}\} \ominus \{f_{SA1}\}. \qquad (2)$$


This equation means that the output fault list contains the SA0 fault 
plus any fault present on every input lead. A fault is present on an 
input lead if it occurs in the lead's fault list or is the lead's input open 
fault. The output fault list does not contain the SA1 fault. 

Third, assume that some inputs are logical 0 (those denoted by i) 
and the remaining inputs are 1 (those denoted by j). The output 
value is 1 and 


$$F = \Bigl\{\Bigl[\bigcap^{0} (F_i \cup \{f_i\})\Bigr] \ominus \Bigl[\bigcup^{1} (F_j \ominus \{f_j\})\Bigr]\Bigr\} \cup \{f_{SA0}\} \ominus \{f_{SA1}\}. \qquad (3)$$


The meaning of the equation follows directly from the meaning of 
eqs. (1) and (2). 

Fourth, if there is a value 2 or 3 on any input and a logical 0 on 
some other input, then the output true value is 1 and $F = \{f_{SA0}\}$ only. 

The fault-list computation equations can be derived by considering 
two input gates. Consider a two-input gate G with inputs A and B. 
If A = B = 1, then eq. (1) can be shown to be true by exhaustive 
analysis. Similarly, if A = B = 0, then eq. (2) is obviously true. Again 
if A = 1 and B = 0, then eq. (8) is true. The NAND is simply AND 
followed by a Not gate and the AND operation is associative and 
commutative. Then eqs. (1) and (2) represent simple cascades of pairs 
of two input gate operations. Similarly, eq. (3) means treat all the 
logical 0 inputs as one AND gate Go, then all the logical 1 inputs as 
an AND gate Gi, and then form the difference of Gp and G,. In this 
explanation, the internal faults were ignored. However, their handling 
is apparent from eqs. (1) through (8). Equations (1) through (3) 
describe how the LAMP fault simulator is implemented. 
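
As an illustration only, the three cases of eqs. (1) through (3) can be sketched with Python sets. The function name, the argument layout, and the fault names used in the test are hypothetical; the sketch assumes every input true value is 0 or 1 and does not cover the fourth case (values 2 or 3).

```python
def nand_fault_list(inputs, open_faults, sa0, sa1):
    """Deductive fault list for one NAND gate, following eqs. (1) through (3).

    inputs      -- list of (true_value, fault_set) pairs, one per input lead;
                   true_value must be 0 or 1 in this sketch
    open_faults -- input-open fault name f_i for each lead, in the same order
    sa0, sa1    -- names of the output stuck-at-0 and stuck-at-1 faults
    Returns (output_true_value, output_fault_set).
    """
    # Faults that flip a logical-0 lead: its fault list plus its input open.
    zero_leads = [F | {f} for (v, F), f in zip(inputs, open_faults) if v == 0]
    # Faults that flip a logical-1 lead: its fault list minus its input open.
    one_leads = [F - {f} for (v, F), f in zip(inputs, open_faults) if v == 1]
    any_one_flips = set().union(*one_leads)

    if not zero_leads:
        # Eq. (1): all inputs are 1, so the true output is 0.
        return 0, (any_one_flips - {sa0}) | {sa1}

    # Eqs. (2) and (3): at least one input is 0, so the true output is 1.
    # With no logical-1 leads, any_one_flips is empty and this is eq. (2).
    all_zeros_flip = set.intersection(*zero_leads)
    return 1, ((all_zeros_flip - any_one_flips) | {sa0}) - {sa1}
```

In the mixed case, a fault that flips a logical-0 lead and a logical-1 lead at the same time leaves the output unchanged, which is why the difference in eq. (3) removes it.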

An alternate and more detailed implementation can be achieved by 
associating two fault lists with each gate whose true value is 3. The 
fault lists contain those faults that will cause the faulty gate output 
to be logical k for k = 0, 1. This allows more detailed analysis of 
faulty circuit behavior during initialization. However, this approach 
will significantly increase the storage required for the fault lists and 
the CPU time required to perform the simulation. For that reason, 
eqs. (1) through (3) were chosen as a realistic compromise between 
detail and cost. 


4.2 Fault-simulator race analysis 


Race analysis under fault conditions (Step 3 in Fig. 1) is performed
on the basic NAND and NOR flip-flop (Fig. 2). An analogous situation
to the true-value race can occur because of faults; that is, because of 
one or more of the faults in the fault list on gate C or D (Fig. 2). Each 
hard fault f in a fault list on gate G means that if f physically exists 
in the circuit, then the true value of G will be complemented. There- 
fore, the behavior of faults is identical to the behavior of true values 
in the faulty circuit. Then with some modification, the algorithm for 
detecting true-value races can also be used to detect fault-induced 
races. A fault f on the output gate(s) of a flip-flop (FF) is a race fault 
(star fault) if it satisfies all of the following conditions: 


(1) Fault f will cause both outputs (D and C) of FF to be 0.
(2) Both gates of FF are scheduled to change at the present
increment of simulation time.
(3) Fault f is not:


(a) The input open on D from C or the input open on C from D.
(b) The output of C SA1 or SA0.
(c) The output of D SA1 or SA0.


The first two conditions are the same as the conditions for a true- 
value race. The third restriction is apparent since, if either of the cross- 
coupled inputs were open, the gates would not form a flip-flop and 
could not race. Similarly, either output SA1 or SA0 would make a
race impossible since there is no uncertainty about the outcome. As 
with the true-value race, faults which force C = 0 and D = 3 or 
C = 3 and D =0 will cause oscillations and are declared as race 
faults. 

Let $F_C$ and $F_D$ be the set of faults (or the fault list) on C and D,
respectively. Let $F_I$ represent the set of faults that cannot cause a
race on FF [those faults listed in condition (3) above]. Consider three
cases:


Case 1: C = 1 and D = 1; then the race faults $F_R$ are given by:
$$F_R = (F_C \cap F_D) \ominus F_I.$$

Case 2: C = 1 and D = 0; then the race faults $F_R$ are given by:
$$F_R = (F_C \ominus F_D) \ominus F_I.$$

Case 3: C = 0 and D = 1; then the race faults $F_R$ are given by:
$$F_R = (F_D \ominus F_C) \ominus F_I.$$


The faults in the set $F_R$ are the star faults. These star faults are then
merged into the fault list on gates C and D. That is,


$$F_C \leftarrow (F_C \ominus F_R) \cup {}^{*}F_R$$
$$F_D \leftarrow (F_D \ominus F_R) \cup {}^{*}F_R,$$


where $F_C$ and $F_D$ are the fault lists on gates C and D, and $F_R$ is the
fault list produced by race analysis. The left arrow ($\leftarrow$) means “is
replaced by.” The new $F_C$ and $F_D$ are assigned to gates C and D
at the same time the other new output values are assigned to their
gates.
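
The three cases and the merge step can be sketched as follows. This is a minimal illustration, not the LAMP code: the function names are hypothetical, a star fault is represented (by assumption) as the tuple ("*", fault), and condition (2), that both gates are scheduled to change, is assumed to have been checked by the caller.

```python
def race_faults(c, d, F_C, F_D, F_I):
    """Faults that drive both flip-flop outputs to 0, excluding F_I."""
    if (c, d) == (1, 1):
        both_zero = F_C & F_D      # Case 1: the fault must flip C and D
    elif (c, d) == (1, 0):
        both_zero = F_C - F_D      # Case 2: flip C, leave D at 0
    elif (c, d) == (0, 1):
        both_zero = F_D - F_C      # Case 3: flip D, leave C at 0
    else:
        raise ValueError("only the three cases in the text are covered")
    return both_zero - F_I


def merge_star_faults(F, F_R):
    """F <- (F minus F_R) union *F_R, with *f modeled as ("*", f)."""
    return (F - F_R) | {("*", f) for f in F_R}
```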


4.3 Fault-simulator oscillation analysis 


A fault oscillation (Step 3 in Fig. 1) is declared if the circuit does 
not stabilize after N increments of simulation time and no true values 
are changing. The number N may be set by the user as described 
earlier. 

If a fault oscillation is detected, the old and new fault lists for each
gate in the set $\{G_i,\ i = 1, 2, \ldots, m\}$ whose inputs changed during
the previous increment of simulation time are compared. Let $F_{ni}$ = new
fault list and $F_{oi}$ = old fault list for some gate in $\{G_i\}$. Then the set
of faults that are causing the fault-list changes, $F_c$, is determined as


$$F_c = \bigcup_i F_{ci}$$


and

$$F_{ci} = (F_{ni} \ominus F_{oi}) \cup (F_{oi} \ominus F_{ni}).$$

Since single faults are assumed, no fault can cause another fault to be
in a fault list. Therefore, the set of faults that alternately appears and
disappears in the fault lists must be causing the oscillation. The set
of faults causing the oscillation, $F_c$, is flagged as star faults (or unioned
as star faults) in the new list $F_{ni}$. That is,


$$F_{ni} \leftarrow (F_{ni} \ominus F_{ci}) \cup {}^{*}F_{ci}.$$


Once a true-value or fault oscillation has been detected, oscillation 
analysis is performed until the circuit has been stabilized. By adding 
star faults or adding the value 3, the circuit should eventually stabilize 
and the oscillation will be resolved. 
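
The fault-list comparison used for oscillation analysis amounts to a symmetric difference per gate. The sketch below is an assumption-laden illustration: the function name, the dictionary layout (gate name to fault set), and the ("*", fault) representation of a star fault are all hypothetical.

```python
def flag_oscillating_faults(old_lists, new_lists):
    """Return (F_c, updated_new_lists) for the gates in new_lists.

    For each gate, the faults that appeared or disappeared between the old
    and new lists are replaced in the new list by their star versions.
    """
    F_c = set()
    updated = {}
    for gate, F_n in new_lists.items():
        F_o = old_lists[gate]
        F_ci = (F_n - F_o) | (F_o - F_n)   # symmetric difference
        F_c |= F_ci
        updated[gate] = (F_n - F_ci) | {("*", f) for f in F_ci}
    return F_c, updated
```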

Figure 5 shows a circuit that illustrates both true-value and fault-list 
oscillations. If K1 = 1, then the circuit will oscillate in true values. 
However, if K1 = 0, the input-open fault from K1 on gate K8 will 
cause the circuit to exhibit a fault-list oscillation. 

Since the calculations involving the star faults are expensive, a 
simulator is available (logic simulator) that immediately terminates 
simulation of any star fault when it occurs. Thus, the logic simulator 
does not simulate the effect of faults that cause ‘‘don’t-know” condi- 
tions. This approximate simulation yields faster simulation times. 


V. OTHER LAMP SIMULATORS 

Sections I through IV of this paper explain the fundamental ideas
behind logic-circuit simulation in LAMP. In this section, a brief
description of the shorted-fault simulator and the functional simulator
is presented. The overall operation of the shorted-fault and functional
simulators is similar to the operation of the simulators presented
earlier. The fundamental difference lies in the method used to compute
the output of the gate or functional element. Therefore, only the basic
differences are discussed here.


Fig. 5—Oscillating circuit.


5.1 The shorted-fault simulator 


The shorted-fault simulator uses the deductive technique to simulate 
the effect on a logic circuit of a single electrical short between two 
gate outputs, where logical 0 is assumed to dominate logical 1. That is, 
if two gates, A and B, are shorted together and (in the absence of the 
short) A has the value 1 and B has the value 0, then in the presence of 
the short, gate A will have its output forced to logical 0. An option 
is available that causes logical 1 to dominate logical 0; however, since 
both cases are similar, only the case dominated by logical 0 is de- 
scribed here. 

The shorted-fault simulator is a recent addition to the LAMP 
system. Because run time was expected to be considerably longer than 
for the fault simulator, the shorted-fault simulator was implemented 
to detect and immediately terminate simulation of all star faults. 

The operation of the shorted-fault gate calculation requires that two
fault lists, the constrained and free fault lists, be associated with each
gate. The free fault list for gate A, called $F_A$, is computed using eqs.
(1) through (3). The constrained fault list on gate A, called $C_A$,
reflects the effects of the signals on any gates that can short to gate A.
For the computation of the constrained fault lists, consider two gates,
A and B, and a fault, s, whose occurrence causes the output of gate
A to be shorted to the output of gate B, as shown in Fig. 6. Consider
two cases:


(i) If A = B = 1,
$$C_A \leftarrow C_A \cup \{s \cap (F_A \cup F_B)\} \qquad (4)$$
$$C_B \leftarrow C_B \cup \{s \cap (F_A \cup F_B)\}. \qquad (5)$$
(ii) If A = 1 and B = 0,
$$C_A \leftarrow C_A \cup \{s \ominus (F_B \ominus F_A)\} \qquad (6)$$
$$C_B \leftarrow C_B \ominus \{s \ominus (F_B \ominus F_A)\}. \qquad (7)$$


The initial constrained fault list on each gate is exactly the free fault
list on that gate. The constrained fault list is then altered as described
in eqs. (4) through (7). These equations can be verified by examining
all eight cases since the only difference between the old and new
constrained fault lists is fault s. This procedure must be repeated for
every shorted fault that can affect the output of gates A and B (such
as fault t in Fig. 6). However, since only one fault is assumed to exist
at any time, all the applications of eqs. (4) through (7) are independent.


Fig. 6—Two shorted faults.
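
The eight-case verification can be carried out mechanically. The sketch below applies eqs. (4) through (7), read with s treated as the one-element set {s}, and compares the result with a direct wired-AND evaluation of the two gate outputs under fault s (logical 0 dominating). All function names are hypothetical, and only the two value cases given in the text are covered.

```python
from itertools import product


def constrained_lists(s, a, b, F_A, F_B):
    """Apply eqs. (4)-(7) for short s between gates A and B."""
    C_A, C_B = set(F_A), set(F_B)      # initial constrained = free lists
    if (a, b) == (1, 1):
        hit = {s} & (F_A | F_B)        # eqs. (4) and (5)
        C_A |= hit
        C_B |= hit
    elif (a, b) == (1, 0):
        hit = {s} - (F_B - F_A)        # eqs. (6) and (7)
        C_A |= hit
        C_B -= hit
    else:
        raise ValueError("only A = B = 1 and A = 1, B = 0 are covered")
    return C_A, C_B


def direct_effect(s, a, b, F_A, F_B):
    """Does fault s change A or B when the outputs are wired-ANDed?"""
    a_s = a ^ (s in F_A)               # free value of A with s present
    b_s = b ^ (s in F_B)               # free value of B with s present
    shorted = a_s & b_s                # logical 0 dominates
    return shorted != a, shorted != b


def check_all_eight_cases():
    for (a, b), in_A, in_B in product([(1, 1), (1, 0)], [False, True], [False, True]):
        F_A = {"s"} if in_A else set()
        F_B = {"s"} if in_B else set()
        C_A, C_B = constrained_lists("s", a, b, F_A, F_B)
        if ("s" in C_A, "s" in C_B) != direct_effect("s", a, b, F_A, F_B):
            return False
    return True
```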

The constrained fault list on gate G is the ‘‘true” fault list for the 
gate since it reflects the effects of potential shorts to gate G. The 
free fault list on gate G is used as the starting point to compute the 
constrained fault list. If the free fault list were discarded after use, 
it would be necessary to go to the inputs of G and recompute the free 
fault list on G wherever it was necessary to derive a new constrained 
fault list on G (e.g., when a gate that could be shorted to G changes 
values). 

The timing considerations are also important. Since the inter- 
connecting paths are assumed to have zero time delay compared to 
the time delay of the gates, the effect of any shorted fault must 
immediately be reflected at the outputs of the gates, which may be 
shorted together. Therefore, the effect of possible shorts on each gate 
in the current timing list must be considered when the new output 
values are assigned to the gates (Step 4 in Fig. 1). The effect of the 
shorted faults may cause gates other than those in the current timing 
list to change value at the current time. This factor must be considered 
in Step 6 of Fig. 1 when the gates whose output value changed are 
scheduled into future timing lists. 

The shorted-fault simulator has helped improve the manufacturing
tests for circuit packs by aiding the design of sets of test inputs that
will detect all shorted faults.


5.2 The functional simulator 


The functional simulator allows the simulation of higher-level func- 
tional elements, such as clocks, registers, and memories, in conjunction 
with gate-level simulation. Thus, the functional simulator can be 
used to evaluate the tentative design of a logic circuit where the 
entire circuit is described as functional elements. Alternatively, func- 
tional memories, registers, and clocks can be added to a gate-level 
simulation to provide more complete or more efficient simulation of 
certain blocks by reducing storage requirements and execution time. 

The control and data flow within the functional block are described
using an “Algol-like” language. Control conditions are described
using “if-then-else” statements. Data transfer is accomplished
using “assignment” statements. Such operators as NOT, AND, OR,
ADD, SUBTRACT, and SHIFT are allowed. Timing information is conveyed
by preceding a statement with an “at time” clause. These
statements are compiled into an extended reverse Polish format and
executed during simulation.

The functional simulator has significantly increased the capabilities 
of the LAMP simulators because of the ease of describing a functional 
unit. It has been used to aid in the logic verification of the No. 1A ESS 
Central Control.


VI. RUN-TIME DATA


The logic and true-value simulators are the most frequently used 
LAMP simulators. Hence, more data are available on their run-time 
characteristics. All data shown were collected using an IBM 360, 
Model 67. 

Table I describes ten typical circuits from a computer system. Since 
there is no convenient way to measure circuit complexity, two ad hoc 
measures are used. The number of flip-flops in a circuit provides 
insight into the circuit complexity on a localized basis while the 
number (or percentage) of gates in the MSCs provides a more global
measure of complexity. These circuits were simulated producing the 
data shown in Table II and Fig. 7. 

Table I — Size and complexity of sample circuits

                                   No. of    No. of       No. of Gates   Fraction of Gates in
Circuit                            Gates*    Flip-Flops   in MSCs        MSCs to Total Gates

A. Serial-to-Parallel Converter       349        90           224            0.64
B. Error Corrector                    340        68           178            0.52
C. Parallel-to-Serial Converter       387        78           184            0.47
D. Decoder and Order Sequencer        311        15            82            0.26
E. Dial-Pulse Sequencer               336        22           112            0.33
F. Decoder and Match Circuit          383         8            44            0.12
G. Arithmetic Unit                   6602       234          4378            0.66
H. Core Store Unit II                9359       320          3517            0.37
I. Core Store Unit I                 2476       167          1182            0.48
J. Processor                       46,012      2149             †              †

* T²L NANDs are used throughout for the circuits listed. There are an
average of two inputs per gate.
† Data not available.


Table II shows the simulator CPU time required to simulate the
circuits described in Table I using the true-value, logic, and parallel
simulators. The data in Fig. 7 show that the average simulator time
$t_a$ required to compute the output true-value and fault list for one
gate (one gate calculation) is a linear function of the length of the
average fault list $L_{avg}$ on that gate. The length of a fault list is the
number of faults in the list. The time $t_a$ includes all bookkeeping and
overhead involved in the simulation.


Table II — Simulation time for three simulators

                                   No. of      No. of      Simulation CPU Time (Seconds)
                                   Faults      Vectors
Circuit                            Simulated   Simulated   Logic    True Value   Parallel

A. Serial-to-Parallel Converter       572         427        433        11         180
B. Error Corrector                    894         412        641         9         102
C. Parallel-to-Serial Converter       559         348        253         9         145
D. Decoder and Order Sequencer        886         893        352        17         135
E. Dial-Pulse Sequencer               395         254         32         5          39
F. Decoder and Match Circuit         1065         161         43         3          52
G. Arithmetic Unit                   2147         377        510        39         927
H. Core Store Unit II                2582         200       8361       330           *
I. Core Store Unit I                 2631          16        326        17         495
J. Processor                         9469         134       8673       180           *

* Data not available.


Fig. 7—Simulator time required to calculate gate output (horizontal axis: average number of faults in list, $L_{avg}$).

Figure 8 shows more data on circuit J in Table I (the No. 1A ESS 
processor). The two lines represent the CPU time (IBM Model 67)
per input vector for execution and read-write tests for the processor 
as a function of the number of faults being simulated. During the 
execution tests, the processor is executing instructions. During the 
read-write tests, the registers of the processor are being written and 
read by a second computer. The processor contains about 100,000 
potential classical faults. These data were collected by simulating a 
subset of the faults against a subset of the diagnostic tests for the 
processor. Main memory size (4 megabytes) limits the number of 
faults that can reasonably be simulated, since it is desirable not to 
utilize the paging features of the Model 67 virtual memory because 
of the real-time penalty incurred due to the slow drum and disc 
accesses. These curves show that simulation time increases linearly 
with the number of faults simulated for a given set of vectors. However, 
the curves also show that simulation times are highly dependent on 
the circuit function being exercised by the input vectors. 


Fig. 8—Simulation times for typical processor diagnostics.


VII. HYPERACTIVE FAULTS


A new phenomenon called hyperactive faults has been found. 
Hyperactive faults are those faults that cause an inordinate amount 
of simulation activity. Removal of the hyperactive faults has reduced 
simulation time by as much as a factor of 8. 

The fault simulator typically is more expensive to use than the
logic simulator. However, it was discovered that on a 40-vector
simulation of a 30,727-gate circuit with 950 faults and 152 star faults
(faults which cause a race at some point during the simulation), the
logic simulator took 750 seconds of IBM 360, Model 67, CPU time
while the fault simulator required 2290 seconds. In an effort to determine
the cause of this large discrepancy, the activity count was
computed for each fault being simulated. The activity count for a fault
is incremented if that fault is in the fault list on some gate at simulated
time t + 1, but not in the fault list on that gate at simulated time t.
The activity count is a measure of the amount of circuit activity
caused by each fault.

Figure 9 shows a typical plot of the activity count distribution. For
the case mentioned above, there were 14 faults whose activity count
was more than 16 times the average activity count for all faults. These
14 faults were removed and the circuit was resimulated with the fault
simulator in 695 seconds. Thus, the removal of 1.5 percent of the
faults caused approximately a 3-to-1 improvement in simulation
CPU time.


Fig. 9—Fault activity count distribution.

Simulator speedups as high as 8 to 1 have resulted from the elimi- 
nation of faults whose activity counts were excessive. For example, 
in a 29,696-gate circuit with 642 faults, the run time was 170 seconds 
per vector. By removing only 14 hyperactive faults (whose activity 
count was greater than 16 times the average), the run time dropped 
to 21 seconds per vector for the same vectors. 
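
The activity count and the 16-times-average screen described above can be sketched as follows. This is an illustration under stated assumptions: the function names, the history format (one dictionary of gate name to fault set per simulated time step), and the choice to average over an explicit list of all simulated faults are hypothetical.

```python
from collections import Counter


def activity_counts(history):
    """Count, per fault, how often it newly enters some gate's fault list.

    history -- list over simulated time steps; each entry maps a gate name
               to the set of faults in that gate's fault list at that time
    """
    counts = Counter()
    for previous, current in zip(history, history[1:]):
        for gate, faults in current.items():
            # In the list at time t + 1 but not at time t.
            for fault in faults - previous.get(gate, set()):
                counts[fault] += 1
    return counts


def hyperactive_faults(counts, all_faults, factor=16):
    """Faults whose count exceeds factor times the average over all faults."""
    average = sum(counts[f] for f in all_faults) / len(all_faults)
    return {f for f in all_faults if counts[f] > factor * average}
```

The default factor of 16 is the threshold quoted in the text; the faults returned would be the candidates for removal before resimulation.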

Two more simulations are of interest. For a 30,727-gate circuit with 
1400 vectors and 2318 faults, including 601 race faults, 341 hyperactive 
faults were removed. On the same circuit with 3500 vectors and 5079 
faults, including 1789 race faults, 101 hyperactive faults were removed. 
Thus, the number of hyperactive faults detected is reasonable. 

Hyperactive faults are typically associated with clock circuits,
sequencer circuits, and “stop circuitry.” The occurrence of a hyperactive
fault in the circuit often removes the effectiveness of critical
control leads and causes the circuit to “run wild.” While the hyperactive
faults cause erratic circuit behavior, they do not necessarily
cause fault-list oscillations.

The removal of hyperactive faults produces the most dramatic
effect in the fault simulator because hyperactive faults are usually a
subset of the star faults (race and oscillation faults) discarded by the
logic simulator. Thus, the logic simulator is not as sensitive to hyperactive
faults. The removal of hyperactive faults from the fault simulator
produces a significant saving in computer resources.


VIII. SUMMARY


The emphasis in the LAMP simulators has been to provide a 
reasonable level of simulation detail in a cost-effective manner. To 
achieve this goal, several simulators have been produced, each of 
which emphasizes some aspect of the cost-versus-detail tradeoff. The 
true-value simulator provides economical simulation of large logic 
circuits by using a true-value, integral-delay, gate-level circuit model. 
The timing simulator is somewhat more expensive since it analyzes 
minimum and maximum rise and fall delays for each gate as well as 
performing spike rejection in a gate-level, logic-circuit model. The 
logic simulator provides a two-value fault simulation using the deduc- 
tive method, and a gate-level circuit model. The fault simulator is 
identical to the logic simulator except that it provides a three-value 
fault simulation. As a result, the fault simulator is more expensive 
than the logic simulator. Clearly, both fault simulators are more 
expensive than the true-value simulators. 

The LAMP system has been used throughout Bell Laboratories to 
aid logic-circuit design and analysis. LAMP, and in particular the 
simulators described in this paper, have been very important in the 
development of the ESS 1A Processor and the No. 4 ESS. Because
of the depth of the simulation capabilities available, LAMP has 
provided efficient simulation capabilities over a wide range of circuit 
sizes and device technologies. 


LAMP:


Automatic Test Generation for 
Asynchronous Digital Circuits 


By S. G. CHAPPELL 
(Manuscript received February 28, 1974) 


An automatic test generation system has been developed to detect faults in
combinational and sequential circuits. The circuit model treats logic circuits
as interconnections of unit- and zero-time-delay gates. A series of
time-dependent Boolean equations is derived from the logic network
(starting from the network inputs) in terms of sequences of signals (input
vectors) on the circuit input leads. These equations account for the effect
of specific circuit faults. Many tests, each consisting of a sequence of
input signals (input vectors), are needed to detect all single faults in a
circuit. Tests are generated from the time-dependent equations using two
different strategies: (i) a maximum-cover approach to detect a large
number of faults quickly by generating tests for the faults on the circuit-input
leads. The fault-detection level achieved by the maximum-cover tests
is then evaluated using fault simulation; (ii) tests for individual faults
not detected by the maximum-cover approach. ATG has been implemented
on the IBM 360, Model 67, and IBM 370, Model 168, computers.


I. INTRODUCTION


The automatic test generation system (ATG) was designed to
provide fault-detection tests for single stuck-at faults in combinational
and sequential circuits. Since this problem has essentially been solved
for combinational circuits, this paper concentrates on aspects of
automatic test generation for sequential circuits.

The ATG algorithms presented attempt to account for actual
circuit behavior as closely as possible. Hence, it is necessary to create
a computer model of the actual gates in the logic circuit. The circuit
description used by ATG will utilize a unit/zero time-delay model,
where a gate can assume one of three values: logical 0, logical 1, and
don’t-know X. This model has been widely used for logic-circuit
simulation. Because the test-generation algorithms described use the
same model as many simulators, there are parallels between the
simulation and test-generation techniques. These result from the effort
to increase the accuracy of test generation to achieve the accuracy of
current simulation techniques.

The major drawback of previous algorithms for test generation
for sequential circuits is the lack of a satisfactory model for the
sequential circuit. Previous algorithms use either the Huffman model
or an iterative combinational circuit model for sequential circuits.
While these models are mathematically convenient, they are hardly
accurate representations of real logic circuits. The system to be
presented here has the following features:


(i) Requires no identification of feedback lines.

(ii) Allows gates to have time delays associated with their response
to input stimuli.

(iii) Resolves races on flip-flops and detects circuit oscillations.

(iv) Assumes that an unknown circuit state corresponds to each
gate having the unknown value X. [The X corresponds to
value 3 in Ref. (5).]

(v) Generates a test for a single stuck-at or open-gate input fault,
if it exists. The test is guaranteed to detect the fault (subject
to the circuit-model assumptions).

(vi) Handles gate-level models of sequential circuits containing up
to approximately 1000 gates.


For economy, the system allows test generation using two strategies. 
The first strategy (maximum cover) generates a set of tests designed 
to detect a large number of single faults without ever explicitly con- 
sidering a specific fault. The second strategy generates tests for speci- 
fied single faults. To allow rapid evaluation of the set of tests derived 
by the first strategy, a fault simulator is needed to simulate all single 
stuck-at faults. This simulator identifies the undetected set of faults 
that must be considered by the strategy-2 test generator. To keep the 
computation time reasonable, a user-specified parameter sets the 
maximum sequence length that will be considered by the system. The 
use and operation of the system is shown in the flow diagram in Fig. 1. 


Fig. 1—Overall strategy. Flow shown: generate tests for “all faults” (Strategy 1); determine which faults were detected (simulate); generate tests for remaining faults (Strategy 2); end.


II. MATHEMATICAL BASIS

This section builds the framework for the remainder of the paper.
The behavior of some gate G will be described by two equations $G^0$ and
$G^1$, where $G^0$ ($G^1$) describes the input conditions that set gate G = 0
(G = 1). The gates can assume one of three logical values: logical 0,
logical 1, or the don’t-know value X. Equations $G^0$ and $G^1$, however,
are strictly Boolean equations in that the constituent variables of $G^0$
and $G^1$ can assume only values of 0 or 1. Similarly, $G^0$ and $G^1$ are
Boolean variables.


2.1 Definitions


The following definitions are used in this discussion.


(i) Input vector: A string of n logical values (0, 1, and X, where
X is a don’t-know value) that is applied to the n corresponding
input leads of a circuit. The effect of these values is allowed to
propagate through the circuit before the next input vector
is applied to the circuit.

(ii) Test: A series of input vectors applied in a specific order to
the circuit inputs. A test is also sometimes called a sequence.
The first vector in each test assumes the circuit is in a completely
unknown state. The Nth vector (N > 1) assumes the
state produced by the preceding N − 1 input vectors. Many
tests may be required to detect all of the detectable faults in a
logic circuit. Notice that it is not necessary to allow the circuit
to stabilize between input vectors.

(iii) Sequence length: The number of input vectors in a test.

(iv) Input variables: Associated with each circuit input lead a are
two binary input variables $a^0$ and $a^1$. The variables $a^0$ and $a^1$
can each take on Boolean values 0 and 1 (or “false” and
“true”). Together, $a^0$ and $a^1$ define the logical value (0, 1, or X)
of input lead a as shown in Table I. Hence, if $a^0 = 1$ (disallowing
$a^0 = a^1 = 1$), then the logical value of lead a is 0. If
$a^1 = 1$, then the logical value of lead a is 1. If neither $a^0 = 1$
nor $a^1 = 1$, then the value of input lead a is unknown or X.
It is clearly impossible for input lead a to simultaneously have
a logical value of 1 and 0. Therefore, $a^0 = a^1 = 1$ is an impossible
situation. The variables $a^0$ and $a^1$ will often be used
as an ordered pair $(a^1, a^0)$. For example, (1, 0) represents the
gate value of logical 1.

(v) Sequence of input variables: Let $a_i^0$ ($a_i^1$), i = 1, 2, ...,
represent the fact that a = 0 (a = 1) during the ith input
vector of a sequence. If no subscript is used (e.g., $a^0$ is written),
then it is assumed that $a^0$ represents the first input vector.

(vi) Notation: As is traditional, “+” represents logical OR, and
“·” represents logical AND. An overbar will be used to
represent NOT or complement.

(vii) Unknown state: If the circuit is in an unknown state, it is
assumed that each gate in the circuit is assigned the unknown
output value X.


Table I — Definition of $a^0$ and $a^1$

   $a^1$ Value   $a^0$ Value   Lead a Logical Value

       0             0              X
       0             1              0
       1             0              1
       1             1          Impossible
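
The ordered-pair encoding of Table I can be written down directly. In this small sketch the dictionary and function names are hypothetical, and an input vector is assumed to be given as a string over the characters 0, 1, and X.

```python
# Logical value -> ordered pair (a1, a0), as in Table I.
PAIR_OF = {"0": (0, 1), "1": (1, 0), "X": (0, 0)}
VALUE_OF = {pair: value for value, pair in PAIR_OF.items()}


def encode_vector(vector):
    """Encode one input vector, e.g. "01X", as a list of (a1, a0) pairs."""
    return [PAIR_OF[v] for v in vector]


def decode_pair(pair):
    """Map an (a1, a0) pair back to "0", "1", or "X"."""
    if pair == (1, 1):
        raise ValueError("a1 = a0 = 1 is an impossible situation")
    return VALUE_OF[pair]


# A test is an ordered series of input vectors; its sequence length is len(test).
test = [encode_vector("01X"), encode_vector("1X0")]
```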


2.2 Properties of the equations 


Some of the properties of input variables $a^0$ and $a^1$ are described in
this section. Consider a circuit consisting of a two-input AND gate c
with inputs a and b. Input leads a and b have associated with them
$(a^1, a^0)$ and $(b^1, b^0)$, respectively. The problem is to compute $(c^1, c^0)$.
The usual truth table for AND is shown below.


                  a
      AND     0   1   X

          0   0   0   0
      b   1   0   1   X
          X   0   X   X


Translating this to the ordered-pair notation, we have:


                               $(a^1, a^0)$
      AND               (0, 1)   (1, 0)   (0, 0)

                (0, 1)  (0, 1)   (0, 1)   (0, 1)
 $(b^1, b^0)$   (1, 0)  (0, 1)   (1, 0)   (0, 0)
                (0, 0)  (0, 1)   (0, 0)   (0, 0)


Examining these ordered pairs, one finds that $c^0 = 1$ if and only if
(iff) $a^0 = 1$ or $b^0 = 1$. Similarly, $c^1 = 1$ iff both $a^1 = 1$ and $b^1 = 1$.
Hence, the following relations hold for a two-input AND gate c with
inputs a and b:


$$c^0 = a^0 + b^0$$
$$c^1 = a^1 \cdot b^1$$
or
$$(c^1, c^0) = (a^1 \cdot b^1,\ a^0 + b^0). \qquad (1)$$


It is important to note that $c^0$ is not necessarily the complement
of $c^1$. For example, if $(a^1, a^0) = (0, 0)$ and $(b^1, b^0) = (1, 0)$, then
$(c^1, c^0) = (0 \cdot 1,\ 0 + 0) = (0, 0)$.

A similar set of relations can be derived for a two-input OR gate f
with inputs d and e.

$$(f^1, f^0) = (d^1 + e^1,\ d^0 \cdot e^0). \qquad (2)$$

The interpretation of this is that f = 1 if either d = 1 or e = 1 or
both. Similarly, f = 0 if both d = 0 and e = 0.

Another relation can be derived for the NOT gate (or inverter) h
with input g as follows. Note that the complement of X is still X.


$$(h^1, h^0) = (g^0, g^1). \qquad (3)$$

For later use, the relations governing the NAND gate are also presented
here. The NAND gate is simply an AND gate followed by a NOT gate.
Hence, we have, for a two-input NAND gate w with inputs y and z:


$$(w^1, w^0) = (y^0 + z^0,\ y^1 \cdot z^1). \qquad (4)$$

The above definitions have been presented for two-input gates.
However, since the functions AND and OR are associative, the equations
for an N-input gate can easily be derived. For example, for a three-input
NAND gate w with inputs p, y, and z, we have:


$$(w^1, w^0) = (p^0 + y^0 + z^0,\ p^1 \cdot y^1 \cdot z^1). \qquad (5)$$

Notice that since $a^0$ and $a^1$ are binary variables, they obey all the laws
of Boolean algebra. However, the interactions of $a^0$ and $a^1$ are not so
obvious and are of interest here.
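
Equations (1) through (5) translate directly into operations on (a1, a0) pairs. The sketch below is illustrative only; the function and constant names are hypothetical, and bitwise operators on the integers 0 and 1 stand in for Boolean AND and OR.

```python
ZERO, ONE, X = (0, 1), (1, 0), (0, 0)   # ordered pairs (a1, a0)


def and_pair(a, b):
    """Eq. (1): (c1, c0) = (a1.b1, a0 + b0)."""
    return (a[0] & b[0], a[1] | b[1])


def or_pair(d, e):
    """Eq. (2): (f1, f0) = (d1 + e1, d0.e0)."""
    return (d[0] | e[0], d[1] & e[1])


def not_pair(g):
    """Eq. (3): (h1, h0) = (g0, g1)."""
    return (g[1], g[0])


def nand_pair(*inputs):
    """Eqs. (4) and (5): AND over all inputs followed by NOT."""
    out = inputs[0]
    for p in inputs[1:]:
        out = and_pair(out, p)
    return not_pair(out)
```

Swapping the two components is all that NOT requires, so the complement of X = (0, 0) is again X, as noted in the text.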

It is significant that in the algorithms presented for computing the
output equations for a gate [eqs. (1) through (5)], we have never
produced a $\overline{G^0}$, $\overline{G^1}$, or $\overline{G}$ expression, where G is any gate in the circuit.
This has occurred for two reasons. First, because gate G can assume
three values, $\overline{G}$ is not particularly useful. For example, if G = 1, then
$\overline{G} = 0 + X$. Second, as a practical matter, the computation of $\overline{G^0}$ or
$\overline{G^1}$, given $G^0$ or $G^1$, is quite time consuming if both input and output
are to be in sum-of-products form.


2.3 Some identities and nonidentities 


After the operations AND, oR, and nov have been defined, further 
properties can be investigated. By simple examination of the definitions 
for AND, oR, and not, the following identities are obvious. Let a 
represent any gate in the circuit. For ease of understanding, the 
corresponding theorem of Boolean algebra is written on the same 
line as the identities but enclosed in brackets. 


(i) $(0, 1) \cdot (a^1, a^0) = (0, 1)$ [$0 \cdot a = 0$].
(ii) $(1, 0) \cdot (a^1, a^0) = (a^1, a^0)$ [$1 \cdot a = a$].
(iii) $(a^1, a^0) \cdot (a^1, a^0) = (a^1, a^0)$ [$a \cdot a = a$].
(iv) $(1, 0) + (a^1, a^0) = (1, 0)$ [$1 + a = 1$].
(v) $(0, 1) + (a^1, a^0) = (a^1, a^0)$ [$0 + a = a$].
(vi) $(a^1, a^0) + (a^1, a^0) = (a^1, a^0)$ [$a + a = a$].
(vii) $(a^1, a^0) \cdot (b^1, b^0) = (b^1, b^0) \cdot (a^1, a^0)$ [Commutative].
    Proof: $(a^1, a^0) \cdot (b^1, b^0) = (a^1 \cdot b^1,\; a^0 + b^0)$
           $= (b^1 \cdot a^1,\; b^0 + a^0) = (b^1, b^0) \cdot (a^1, a^0)$ QED.
    Similarly,
(viii) $(a^1, a^0) + (b^1, b^0) = (b^1, b^0) + (a^1, a^0)$ [Commutative].
(ix) $[(a^1, a^0) \cdot (b^1, b^0)] \cdot (c^1, c^0) = (a^1, a^0) \cdot [(b^1, b^0) \cdot (c^1, c^0)]$
    [Associative].
    Proof: $[(a^1, a^0) \cdot (b^1, b^0)] \cdot (c^1, c^0)$
           $= ([a^1 \cdot b^1] \cdot c^1,\; [a^0 + b^0] + c^0)$
           $= (a^1 \cdot [b^1 \cdot c^1],\; a^0 + [b^0 + c^0])$
           $= (a^1, a^0) \cdot [(b^1, b^0) \cdot (c^1, c^0)]$ QED.
    Similarly,
(x) $[(a^1, a^0) + (b^1, b^0)] + (c^1, c^0)$
    $= (a^1, a^0) + [(b^1, b^0) + (c^1, c^0)]$ [Associative].
(xi) $(a^1, a^0) \cdot (b^1, b^0) + (a^1, a^0) = (a^1, a^0)$ [$a \cdot b + a = a$].
    Proof: $(a^1, a^0) \cdot (b^1, b^0) + (a^1, a^0)$
           $= (a^1 \cdot b^1,\; a^0 + b^0) + (a^1, a^0)$
           $= (a^1 \cdot b^1 + a^1,\; [a^0 + b^0] \cdot a^0)$
           $= (a^1,\; a^0 + a^0 \cdot b^0) = (a^1, a^0)$ QED.
(xii) $a^0 \cdot a^1 = a^1 \cdot a^0 = 0$.
    This is obviously true if a is a circuit input lead. Because the
    computation of the equations proceeds from gate inputs to
    gate outputs, this result can be shown inductively. For any
    valid circuit state (gates have logical values 0, 1, or X), the
    theorem is true. It is also true for input leads, as mentioned
    earlier. Then, by examination of eqs. (1) through (5) above,
    we see that the relationship is preserved when the new gate
    output equations are computed. Hence, by induction, it
    follows that the relationship holds for every gate in the circuit.


(xiii) $\overline{[(a^1, a^0) \cdot (b^1, b^0)]} = (a^0, a^1) + (b^0, b^1)$ [$\overline{(a \cdot b)} = \overline{a} + \overline{b}$].
    Proof: $\overline{[(a^1, a^0) \cdot (b^1, b^0)]} = \overline{(a^1 \cdot b^1,\; a^0 + b^0)}$
           $= (a^0 + b^0,\; a^1 \cdot b^1) = (a^0, a^1) + (b^0, b^1)$ QED.
(xiv) $\overline{[(a^1, a^0) + (b^1, b^0)]} = (a^0, a^1) \cdot (b^0, b^1)$ [$\overline{(a + b)} = \overline{a} \cdot \overline{b}$].
    Proof: $\overline{[(a^1, a^0) + (b^1, b^0)]} = \overline{(a^1 + b^1,\; a^0 \cdot b^0)}$
           $= (a^0 \cdot b^0,\; a^1 + b^1) = (a^0, a^1) \cdot (b^0, b^1)$ QED.
    Again, it is clear that identities (xiii) and (xiv) can easily be
    extended to several variables (e.g., $\overline{(a \cdot b \cdot c)} = \overline{a} + \overline{b} + \overline{c}$).
    These are simply DeMorgan's theorems.
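
The identities can also be checked mechanically. The following is a minimal sketch, not part of the LAMP implementation: it assumes each signal is stored as a Python pair (a1, a0), restricts the values to logical 0 = (0, 1), logical 1 = (1, 0), and X = (0, 0), and tests every identity exhaustively. The names `AND`, `OR`, `NOT`, `IDENTITIES`, and `holds` are illustrative only.

```python
from itertools import product

# A signal is the pair (a1, a0): a1 = 1 means "forced to 1", a0 = 1 means "forced to 0".
ZERO, ONE, X = (0, 1), (1, 0), (0, 0)
VALID = [ZERO, ONE, X]            # (1, 1) never occurs, by identity (xii)

def AND(a, b):
    return (a[0] & b[0], a[1] | b[1])

def OR(a, b):
    return (a[0] | b[0], a[1] & b[1])

def NOT(a):
    return (a[1], a[0])

# identity name -> (number of variables, predicate expected to hold for all valid values)
IDENTITIES = {
    "i":    (1, lambda a: AND(ZERO, a) == ZERO),
    "ii":   (1, lambda a: AND(ONE, a) == a),
    "iii":  (1, lambda a: AND(a, a) == a),
    "iv":   (1, lambda a: OR(ONE, a) == ONE),
    "v":    (1, lambda a: OR(ZERO, a) == a),
    "vi":   (1, lambda a: OR(a, a) == a),
    "vii":  (2, lambda a, b: AND(a, b) == AND(b, a)),
    "viii": (2, lambda a, b: OR(a, b) == OR(b, a)),
    "ix":   (3, lambda a, b, c: AND(AND(a, b), c) == AND(a, AND(b, c))),
    "x":    (3, lambda a, b, c: OR(OR(a, b), c) == OR(a, OR(b, c))),
    "xi":   (2, lambda a, b: OR(AND(a, b), a) == a),
    "xii":  (1, lambda a: a[0] & a[1] == 0),
    "xiii": (2, lambda a, b: NOT(AND(a, b)) == OR(NOT(a), NOT(b))),
    "xiv":  (2, lambda a, b: NOT(OR(a, b)) == AND(NOT(a), NOT(b))),
}

def holds(nvars, predicate):
    """True if the predicate holds for every assignment of valid pairs."""
    return all(predicate(*values) for values in product(VALID, repeat=nvars))

results = {name: holds(n, p) for name, (n, p) in IDENTITIES.items()}
print(results)
```

Because the check enumerates all three values per variable, it covers the unknown value X as well as the two binary values.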











The identities above simply follow the Boolean algebra. The following
set of nonidentities results primarily from the three values used to
model the gate behavior.


(i) $a^0 + a^1 \neq 1$.
    Proof by example: $(a^1, a^0) = (0, 0)$.
    Clearly, $0 + 0 \neq 1$.
    This is not unexpected since the only relation between $a^0$ and
    $a^1$ is that $a^0 \cdot a^1 = 0$.
(ii) $(a^1, a^0) \cdot (b^1, b^0) + (a^1, a^0) \cdot (b^0, b^1) \neq (a^1, a^0)$ [$a \cdot b + a \cdot \overline{b} \neq a$].
(iii) $a \cdot c + \overline{a} \cdot b \cdot c \neq b \cdot c + a \cdot c$.
(iv) $a \cdot b + \overline{a} \cdot c + b \cdot c \neq a \cdot b + \overline{a} \cdot c$.
    Nonidentities (ii), (iii), and (iv) are easily proved by examining
    the truth tables, where the variables are allowed to assume three
    values: logical 0, logical 1, and the don't-know value X.

It is interesting to note that if we required the circuit input leads
to have only values 0 and 1, the system presented here would reduce
to Boolean algebra with $a^0 = \overline{a^1}$ and $a^1 = \overline{a^0}$.
This is a reasonable restriction, since we could always require that
any X values generated for the input leads be arbitrarily set to
logical 0 or 1. However, it would then be necessary to treat input
leads differently from other gates in the circuit, since it is clearly
not possible to force every gate in the circuit to a known value
(logical 0 or 1). Hence, the generality of allowing circuit input leads
to assume the value X is retained in this paper and all gates are
treated identically.
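
The truth-table argument for the nonidentities can be sketched in a few lines. This is an illustrative check under the same assumed pair encoding (a1, a0), not code from LAMP. It searches for assignments where the two sides differ, first over the three values 0, 1, and X, then over 0 and 1 only. The helper name `counterexamples` is hypothetical.

```python
from itertools import product

ZERO, ONE, X = (0, 1), (1, 0), (0, 0)    # pairs (a1, a0)

def AND(a, b):
    return (a[0] & b[0], a[1] | b[1])

def OR(a, b):
    return (a[0] | b[0], a[1] & b[1])

def NOT(a):
    return (a[1], a[0])

def counterexamples(nvars, lhs, rhs, values):
    """All assignments over `values` for which lhs and rhs differ."""
    return [v for v in product(values, repeat=nvars) if lhs(*v) != rhs(*v)]

# name -> (number of variables, left-hand side, right-hand side)
NONIDENTITIES = {
    "ii":  (2, lambda a, b: OR(AND(a, b), AND(a, NOT(b))),
               lambda a, b: a),
    "iii": (3, lambda a, b, c: OR(AND(a, c), AND(AND(NOT(a), b), c)),
               lambda a, b, c: OR(AND(b, c), AND(a, c))),
    "iv":  (3, lambda a, b, c: OR(OR(AND(a, b), AND(NOT(a), c)), AND(b, c)),
               lambda a, b, c: OR(AND(a, b), AND(NOT(a), c))),
}

three_valued = {k: counterexamples(n, l, r, [ZERO, ONE, X])
                for k, (n, l, r) in NONIDENTITIES.items()}
two_valued = {k: counterexamples(n, l, r, [ZERO, ONE])
              for k, (n, l, r) in NONIDENTITIES.items()}

# Nonidentity (i): for the unknown value, a0 + a1 is 0, not 1.
x_sum = X[0] | X[1]
print(x_sum, {k: len(v) for k, v in three_valued.items()}, two_valued)
```

Every counterexample found involves at least one X, which is consistent with the remark that the system reduces to Boolean algebra when all leads are restricted to 0 and 1.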


III. EQUATION DERIVATION FOR LOGIC NETWORKS


The operation of the ATG has two well-defined steps. The first 
step is to derive a set of relations (equations) that represent the 
behavior of the logic circuit. The second step is to derive a set of tests 
for the circuit based on the equations derived in the first step. In this 
section the equation-derivation process is described. 

The equation derivation process essentially reduces the behavior of 
a logic circuit to a series of equations. Hence, this reduction process is 
quite critical. These equations must reflect the true circuit behavior as 
closely as is possible (or economical). This means that the time delay 
of gates must be accounted for during the equation-generation process. 
The equation-generation process will first be presented using a fault- 
free, unit/zero, time-delay model for each gate. The model will then 
be extended to account for single stuck-at-one and stuck-at-zero faults. 

The method is essentially a dynamic equation-generation process 
that determines exactly those input sequences that will force each 
gate to a logical 0 or 1 at each instant of time. The equation-derivation 
process begins with the circuit inputs and continues through the circuit 
until the equations are stable; that is, until the output equation on 
each gate is consistent with the input equation on that gate. The 
equations are derived in terms of circuit input variables only; no 
feedback lines need be identified. The input variables may change 
several times before the circuit finally reaches a stable state. 

Since the objective is to generate tests to detect faults in a circuit, 
the result of this process will be a series of logical values 0, 1, and X 
(don’t know) to be applied to the input leads of the circuit. The output 
of the circuit will then be observed to determine which classical faults 
have been detected. That is, the output of the real (perhaps faulty) 
circuit will be compared to the expected result to determine if the 
real circuit is performing correctly.


3.1 Fault-free-equation derivation


In this section, the problem of generating equations that represent 
the behavior of the fault-free circuit is discussed. Because certain 
simplifications are possible, the equation-derivation process for com- 
binational circuits is discussed first. This is followed by the equation- 
derivation process for sequential circuits. 


3.1.1 Equation derivation for combinational circuits 


The derivation of the fault-free equations will be considered here.
Consider the NAND gate G shown in Fig. 2. If we assume both inputs
to the gate are circuit inputs or other gate outputs, then we have:


$G^0 = A^1 \cdot B^1$
$G^1 = A^0 + B^0$.


Equation $G^0$ denotes exactly those input conditions to gate G that
force (or set) gate G to logical 0. Implicit in this equation is the
unit-delay assumption. If inputs A = 1 and B = 1 are applied at time t,
then the output of G is forced to logical 0 at time t + 1. A similar
situation exists for $G^1$. Either A = 0 or B = 0 (or both) applied at
time t to the inputs of G forces its output to be logical 1 at time
t + 1. This is similar to eqs. (1) through (5) in the previous section,
except that the element of time has been added. For most gates, the
output of the gate responds to the input stimuli one unit of time
later. The gates with one unit of delay are "real" gates, e.g., those
containing an active semiconductor device.

In some logic families it is possible to directly connect two (or more)
gate output leads together. This connection (called a TIC here, for
tied collector) performs a logic function. If the ground level is logical 0,
then the TIC function is AND. If the ground level is logical 1, then the
TIC function is OR. The TICs may be considered zero-delay gates except
for the wire-propagation delay, which is not considered here.

If computation begins at the circuit inputs, which are assumed to be
applied at time t, the output of each gate driven by a primary input is
reevaluated and the new equation is assigned to the gate output at
time t + 1. Every gate whose input equation changed at time t + 1
is reevaluated and its new output is assigned at time t + 2. This
process continues until the computation reaches the circuit output
gates. At any time $t + \tau$ after the processing begins, the equations
denote those input conditions that force each gate to logical 0 and 1
at $\tau$ gate delays after application of the vector. In particular,
when the circuit has settled to a stable state, the input values that
set each gate to logical 0 or 1 are specified. Notice that no
assumptions have been made that would preclude the application of this
argument to sequential circuits.

[Figure: NAND gate G with inputs $(A^0, A^1)$ and $(B^0, B^1)$; output equations $G^0 = A^1 \cdot B^1$ and $G^1 = A^0 + B^0$.]
Fig. 2—Equations based on input equations.

The similarity between this procedure and the actual propagation 
of electrical signals through the circuit should be evident. In both 
cases, the input stimuli are applied to the circuit inputs and are allowed 
to propagate through the circuit. 


3.1.2 Equation derivation for sequential circuits 


Three points are significant in the discussion of equation derivation
for combinational circuits: (i) the process assumes all gates have
either unit or zero delay, (ii) the process starts from the circuit inputs
and proceeds through the circuit much as a signal would propagate
through the circuit, and (iii) the equations define, for each time
$t + \tau$, exactly those input conditions that cause each gate in the
circuit to be forced to logical 0 and 1 at that time from the specified
initial state. Again, there are no assumptions that limit this
technique to combinational circuits.

The primary addition, which must be made to allow the same
algorithm to be applied to sequential circuits, is some provision for
deciding when to stop the computation. For combinational circuitry,
the computation stops when the circuit outputs are reached. However,
this is not satisfactory for sequential circuits. The equation derivation
yields $G^0$ and $G^1$ for each gate G for each time $t + \tau$. If both
$G^0$ and $G^1$ at time $t + \tau$ are equal to $G^0$ and $G^1$ at time
$t + \tau + 1$, the gate is in a stable state. Otherwise, each gate
driven by gate G must be reevaluated since G changed values (output
equations). A detailed flow chart of the equation computation process
will be presented later.

Let a.i represent the value of circuit input lead a during the ith
vector of the test (or sequence). Similarly, $a^{0}i$ ($a^{1}i$) means
make input lead a logical 0 (logical 1) during the ith input vector of
the sequence. The first vector in each sequence is number one. If the
sequence number is missing, then it is assumed to represent the first
vector of the sequence. An example of the application of this algorithm
is shown in Fig. 3 where the equations for a flip-flop are calculated.
Time runs down the page. The flip-flop is assumed to start from the
unknown state since $F^0 = F^1 = G^0 = G^1 = 0$. The inputs are assumed
to be applied at time t. At time t + 1, only $F^1$ and $G^1$ changed
values so at time t + 2 only $G^0$ and $F^0$ are calculated. At time
t + 3, none of the new output equations changed, so the circuit is
stable and computation stops.

[Figure: cross-coupled NAND gates F, with input $(a^0, a^1)$, and G, with input $(b^0, b^1)$.]

TIME     $F^0$             $F^1$                          $G^0$             $G^1$
t        0                 0                              0                 0
t + 1    0                 $a^0$                          0                 $b^0$
t + 2    $a^1 \cdot b^0$   $a^0$                          $a^0 \cdot b^1$   $b^0$
t + 3    $a^1 \cdot b^0$   $a^0 + a^0 \cdot b^1 = a^0$    $a^0 \cdot b^1$   $b^0 + a^1 \cdot b^0 = b^0$


Fig. 3—Equations for NAND flip-flop from an unknown state.

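
The computation of Fig. 3 can be reproduced with a short sketch. The code below is an illustration written for this discussion, not the LAMP program. It assumes a sum-of-products equation is stored as a frozenset of product terms, each term a frozenset of (name, value) literals, and it simplifies only by dropping contradictory products and absorbed terms. Unlike the event-driven procedure described above, it reevaluates both gates at every time step. All function names are hypothetical.

```python
def absorb(terms):
    """Drop any product that strictly contains another product (p + p*q = p)."""
    terms = set(terms)
    return frozenset(t for t in terms if not any(s < t for s in terms))

def consistent(term):
    """A product is nonzero only if no lead appears as both a^0 and a^1."""
    names = [name for name, _ in term]
    return len(names) == len(set(names))

def AND(p, q):
    return absorb(s | t for s in p for t in q if consistent(s | t))

def OR(p, q):
    return absorb(p | q)

ZERO = frozenset()                        # no input condition forces this value

def lit(name, value):
    """Equation consisting of the single literal name^value."""
    return frozenset([frozenset([(name, value)])])

def nand(x, y):
    """Signals are pairs (eq1, eq0); NAND yields (x0 + y0, x1 * y1)."""
    return (OR(x[1], y[1]), AND(x[0], y[0]))

def show(eq):
    """Readable form such as 'a1*b0'; '0' for the empty sum."""
    return " + ".join("*".join(f"{n}{v}" for n, v in sorted(t))
                      for t in sorted(eq, key=sorted)) or "0"

a = (lit("a", 1), lit("a", 0))            # (a^1, a^0)
b = (lit("b", 1), lit("b", 0))
F = G = (ZERO, ZERO)                      # unknown initial state
history = [(F, G)]                        # history[k] holds the equations at time t + k
while True:
    new = (nand(a, G), nand(b, F))        # both gates see the previous outputs
    if new == (F, G):
        break                             # equations repeat: the circuit is stable
    F, G = new
    history.append(new)

print("F0 =", show(F[1]), " F1 =", show(F[0]))
print("G0 =", show(G[1]), " G1 =", show(G[0]))
```

The loop stops on the evaluation for time t + 3, when the newly computed equations equal those stored for time t + 2.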

A similar computation can be carried out if the circuit is in some
known initial state. This is illustrated in Fig. 4 where the circuit is
initially set at F = 0 ($F^0 = 1$, $F^1 = 0$) and G = 1 ($G^0 = 0$, $G^1 = 1$).

The above procedure finds the next state function for a combina- 
tional or sequential circuit. That is, given a circuit state (possibly 
unknown), we can find all possible next states resulting from the 
application of one input vector. 


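[Figure: cross-coupled NAND gates F, with input $(a^0, a^1)$, and G, with input $(b^0, b^1)$.]

TIME     $F^0$    $F^1$                          $G^0$             $G^1$
t        1        0                              0                 1
t + 1    $a^1$    $a^0$                          0                 1
t + 2    $a^1$    $a^0$                          $a^0 \cdot b^1$   $a^1 + b^0$
t + 3    $a^1$    $a^0 + a^0 \cdot b^1 = a^0$    $a^0 \cdot b^1$   $a^1 + b^0$

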
Fig. 4—Equations from F = 0, G = 1 state. 

Clearly, the problem is to determine the state of the circuit as the
result of each possible sequence of vectors. If this can be done, then
there is no need to select a particular next state since they are all
considered simultaneously. A method for doing this is described in
the next section.


3.1.3 Sequence derivation for sequential circuits 


The algorithm for sequence derivation is based only on the input 
behavior of the circuit. There is no need to consider any feedback 
variables. Again, the derivation assumes either unit- or zero-delay gates. 

This algorithm is based on the explanation presented in the previous 
section. The derivation proceeds as follows for a sequence of length M. 


(i) To the circuit inputs [a, b, c, ...] = I apply the variables
    [$a^{*}1$, $b^{*}1$, $c^{*}1$, ...] = I.1 ($a^{*}$ means apply $a^1$ or $a^0$) and derive
    the equations for the circuit (starting from any initial state).

(ii) Let j = 1.

(iii) From the circuit "state," as defined by the application of the
    input vector of variables I.j = [$a^{*}j$, $b^{*}j$, $c^{*}j$, ...], apply the
    input vector of variables I.(j + 1) and propagate these variables
    through the circuit, i.e., derive the "equations" for the circuit
    in terms of I.(j + 1) and I.k for all $k \le j$. The effect of I.j
    need not stabilize before applying I.(j + 1).

(iv) If j < M, then let j = j + 1 and go to step (iii). Otherwise,
    exit.


This procedure models the behavior of a logic circuit. The input
stimuli (variables) are applied to the inputs of the circuit and allowed
to propagate through the circuit. The input vector I.1 assumes the
circuit is in some initial state, which is probably unknown. Input
vector I.2 produces equations from the initial state produced by I.1.
In general, the vector I.j starts from the state produced by all I.k
where k < j.

After application of I.j, the effect on the circuit of any sequence of
j vectors is known. This is obvious since we have already shown that
the application of I.1 from any state produces the equations $G^0$ and
$G^1$ for every gate in the circuit as a result of I.1. This extension
makes the initial state for I.j, $j \ge 2$, a function of all j - 1
preceding vectors.

An application of this algorithm is shown in Figs. 3 and 5 for the
NAND flip-flop. Here I.j = ($a^{*}j$, $b^{*}j$) and the sequence derivation is
carried out for sequences of length 2 or less. Figure 3 represents
sequences of length 1. Figure 5 represents sequences of length 2. The
computation for sequences of length 2 in Fig. 5 begins from the final
state of the computation for sequences of length 1 shown in Fig. 3.

[Figure: cross-coupled NAND gates F, with input $(a^{1}2, a^{0}2)$, and G, with input $(b^{1}2, b^{0}2)$.]

TIME     $F^0$                                                               $F^1$
t + 3    $a^{1}1 \cdot b^{0}1$                                               $a^{0}1$
t + 4    $a^{1}2 \cdot b^{0}1$                                               $a^{0}1 \cdot b^{1}1 + a^{0}2$
RACE     $a^{1}1 \cdot a^{1}2 \cdot b^{0}1 + a^{1}2 \cdot b^{0}1 \cdot b^{0}2$
t + 5    $a^{1}1 \cdot a^{1}2 \cdot b^{0}1 + a^{1}2 \cdot b^{0}2$            $a^{0}1 \cdot b^{1}1 \cdot b^{1}2 + a^{0}2$

TIME     $G^0$                                                               $G^1$
t + 3    $a^{0}1 \cdot b^{1}1$                                               $b^{0}1$
t + 4    $a^{0}1 \cdot b^{1}2$                                               $a^{1}1 \cdot b^{0}1 + b^{0}2$
RACE     $a^{0}1 \cdot b^{1}1 \cdot b^{1}2 + a^{0}1 \cdot a^{0}2 \cdot b^{1}2$
t + 5    $a^{0}1 \cdot b^{1}1 \cdot b^{1}2 + a^{0}2 \cdot b^{1}2$            $a^{1}1 \cdot a^{1}2 \cdot b^{0}1 + b^{0}2$


Fig. 5—Result of second vector of sequence.


To illustrate the interpretation of the equations, consider the final
state of $G^0 = a^{0}2 \cdot b^{1}2 + a^{0}1 \cdot b^{1}1 \cdot b^{1}2$ shown in Fig. 5. This
means that the sequence of length 1, a = 0 and b = 1 (for
$a^{0}2 \cdot b^{1}2$), or the sequence of length 2, a = 0 and b = 1
followed by a = X and b = 1 (for $a^{0}1 \cdot b^{1}1 \cdot b^{1}2$), will set
gate G to logical 0.
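
The sequence derivation for the NAND flip-flop can be sketched as follows. This is an illustrative reconstruction rather than the LAMP code. It assumes literals are (lead, vector number, value) triples, lets each vector settle before the next is applied (the procedure itself does not require this), applies exactly M vectors, and applies the race-analysis products of Section 3.1.4 at every time step. The names `step` and `derive` are hypothetical.

```python
def absorb(terms):
    """Drop any product that strictly contains another product (p + p*q = p)."""
    terms = set(terms)
    return frozenset(t for t in terms if not any(s < t for s in terms))

def consistent(term):
    """Reject products that need one lead at both 0 and 1 in the same vector."""
    keys = [(name, vec) for name, vec, _ in term]
    return len(keys) == len(set(keys))

def AND(p, q):
    return absorb(s | t for s in p for t in q if consistent(s | t))

def OR(p, q):
    return absorb(p | q)

ZERO = frozenset()

def lit(name, vec, value):
    return frozenset([frozenset([(name, vec, value)])])

def step(F, G, a, b):
    """One unit delay for F = NAND(a, G) and G = NAND(b, F); signals are (eq1, eq0)."""
    F1, F0 = OR(a[1], G[1]), AND(a[0], G[0])
    G1, G0 = OR(b[1], F[1]), AND(b[0], F[0])
    # Race analysis: replace F0 by F0 * G1 and G0 by G0 * F1.
    return (F1, AND(F0, G1)), (G1, AND(G0, F1))

def derive(M, max_steps=20):
    """Equations after applying vectors I.1 ... I.M from the unknown state."""
    F = G = (ZERO, ZERO)
    for j in range(1, M + 1):
        a = (lit("a", j, 1), lit("a", j, 0))
        b = (lit("b", j, 1), lit("b", j, 0))
        for _ in range(max_steps):
            new = step(F, G, a, b)
            if new == (F, G):
                break                     # stable for this vector
            F, G = new
        else:
            raise RuntimeError("equation oscillation")
    return F, G

F, G = derive(M=2)
for term in sorted(G[1], key=sorted):     # the products of G^0
    print(sorted(term))
```

With M = 2 the products of G^0 are the two discussed above: one that involves only the second vector and one that involves both vectors.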


3.1.4 Equation race analysis 


A race occurs on the simple two-NAND-gate flip-flop shown in Fig. 3
when the output state of the flip-flop is unpredictable from the input
conditions. Under these circumstances, the outputs of the flip-flop
must be set to the unknown value X. Let us examine
$F^0 = b^{0}1 \cdot a^{1}2$ and $G^0 = a^{0}1 \cdot b^{1}2$ at time t + 4 in Fig. 5.
Since $F^0 \cdot G^0 \neq 0$, both $b^{0}1 \cdot a^{1}2$ and $a^{0}1 \cdot b^{1}2$
could be simultaneously applied to the circuit inputs, producing the
sequence $a^{0}1 \cdot b^{0}1 \cdot a^{1}2 \cdot b^{1}2$. This represents the
application to the flip-flop of the sequence a = 0 and b = 0 followed
by a = 1 and b = 1. This produces the race state (unpredictable
output conditions) for the NAND flip-flop and must therefore be
eliminated. The race state for our example is a = b = F = G = 1 at
some time t. If a race occurs, the values computed and saved for time
t + 1 are F = G = 0. In addition, when unknown states are allowed,
a race is also declared if F = 0 and G = X or F = X and G = 0 at the
same instant of time. This implies that when F = 0, G cannot be 0
or X. Thus, to eliminate races, it is necessary to demand that if F = 0
then G = 1 and, similarly, if G = 0 then F = 1 at the same instant
of time. This is accomplished in our example by forming the new
equation $F^{0}n$ at time t + 4 as
$F^{0}n(t + 4) = F^0(t + 4) \cdot G^1(t + 4) = a^{1}1 \cdot a^{1}2 \cdot b^{0}1 + a^{1}2 \cdot b^{0}1 \cdot b^{0}2$.
Similarly,
$G^{0}n(t + 4) = a^{0}1 \cdot a^{0}2 \cdot b^{1}2 + a^{0}1 \cdot b^{1}2 \cdot b^{1}1$.
This process is called race analysis since it prevents the
equations from causing simple flip-flops (basically, two cross-coupled
NAND or NOR gates) to race.

Race analysis must be performed at time t if both $F^0$ and $G^1$ changed
at time t, where F and G are the two gates in a simple flip-flop. While
this result is shown here for the NAND flip-flop, the proof can easily be
extended to NOR flip-flops. The primary difference is that $F^1$ and $G^1$
must be modified for NOR flip-flops while $F^0$ and $G^0$ must be modified
for NAND flip-flops.


(i) If $F^0$ does not change, then gate F cannot change to logical 0
    at time t; therefore, there can be no race.

(ii) If $G^1$ (see Fig. 5) does not change at time t, then $F^0(t)$ was
    formed by ANDing together $G^1(t - 1)$ and $a^1(t - 1)$. That is,
    $F^0(t) = G^1(t - 1) \cdot a^1(t - 1)$. But $G^1(t - 1) = G^1(t)$ by
    assumption. Race analysis would form


    $F^{0}n(t) = F^0(t) \cdot G^1(t) = F^0(t) \cdot G^1(t - 1)$
    $= a^1(t - 1) \cdot G^1(t - 1) \cdot G^1(t - 1)$
    $= a^1(t - 1) \cdot G^1(t - 1) = F^0(t)$.


    Therefore, the new $F^{0}n$ resulting from race analysis is the same
    as the original $F^0$. Then, there can be no race.


Earlier it was shown that $F^0 \cdot F^1 = 0$ for any gate F at any time.
It is easily seen that race analysis does not destroy this property
since, if $F^0(t) \cdot F^1(t) = 0$ and $F^{0}n(t) = F^0(t) \cdot G^1(t)$, then
$F^{0}n(t) \cdot F^1(t) = F^0(t) \cdot G^1(t) \cdot F^1(t) = 0$.
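
As a worked check of the race-analysis step, the sketch below starts from the time t + 4 equations of Fig. 5 and forms the products described above. It is an illustration under an assumed data layout, with literals stored as (lead, vector number, value) triples; the helper names are hypothetical.

```python
def absorb(terms):
    """Drop any product that strictly contains another product (p + p*q = p)."""
    terms = set(terms)
    return frozenset(t for t in terms if not any(s < t for s in terms))

def consistent(term):
    """Reject products that need one lead at both 0 and 1 in the same vector."""
    keys = [(name, vec) for name, vec, _ in term]
    return len(keys) == len(set(keys))

def AND(p, q):
    return absorb(s | t for s in p for t in q if consistent(s | t))

def term(*lits):
    return frozenset(lits)

# Equations at time t + 4, before race analysis.
F0 = frozenset([term(("a", 2, 1), ("b", 1, 0))])
G0 = frozenset([term(("a", 1, 0), ("b", 2, 1))])
F1 = frozenset([term(("a", 1, 0), ("b", 1, 1)), term(("a", 2, 0))])
G1 = frozenset([term(("a", 1, 1), ("b", 1, 0)), term(("b", 2, 0))])

race_before = AND(F0, G0)        # nonzero: some sequence sets F = 0 and G = 0 together
F0n = AND(F0, G1)                # demand G = 1 whenever F = 0
G0n = AND(G0, F1)                # demand F = 1 whenever G = 0
race_after = AND(F0n, G0n)       # empty: no sequence sets both gates to 0

print(len(race_before), len(F0n), len(G0n), len(race_after))
```

The single product in `race_before` is the sequence a = 0, b = 0 followed by a = 1, b = 1, which is the race condition named in the text.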


3.1.5 Equation oscillations 


It is possible that the equation computation process will never 
terminate. That is, the old equations on some gate are always different 
from the new equations on that gate. This situation is known as 
equation oscillation. If the computation described in Section 3.1.3 
proceeds through an arbitrary number (user declared) of timing lists, 
then an oscillation is declared and the message ‘‘equation oscillation” 
is printed for the user. 

An example of an oscillation is shown in Fig. 6. In general, the
objective is to stop the oscillation by selecting a stable set of equations.
This can usually be done by setting the new equation on an oscillating
gate, say $F^0(t)$, equal to $F^0(t) \cdot F^0(t - 1)$. This is intended to
force the equations on gate F to stabilize by generating equations that
make $F^0(t) = F^0(t - 1)$. This technique is not guaranteed to resolve
all oscillations.

Fig. 6—Equation oscillation.
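
The circuit of Fig. 6 is not reproduced here, so the sketch below uses a hypothetical stand-in: a single NAND gate F whose second input is its own output, started from the known state F = 0. The step bound `max_steps` plays the role of the user-declared limit. The damping rule is applied to both $F^1$ and $F^0$, which is an assumption of this sketch; the text states the rule only for $F^0$.

```python
def absorb(terms):
    """Drop any product that strictly contains another product (p + p*q = p)."""
    terms = set(terms)
    return frozenset(t for t in terms if not any(s < t for s in terms))

def consistent(term):
    names = [name for name, _ in term]
    return len(names) == len(set(names))

def AND(p, q):
    return absorb(s | t for s in p for t in q if consistent(s | t))

def OR(p, q):
    return absorb(p | q)

ZERO = frozenset()                        # constant 0: no products
ONE = frozenset([frozenset()])            # constant 1: the empty product

def lit(name, value):
    return frozenset([frozenset([(name, value)])])

a1, a0 = lit("a", 1), lit("a", 0)

def derive(max_steps, damp_after=None):
    """Equations (F^1, F^0) for F = NAND(a, F), starting from F = 0."""
    F = (ZERO, ONE)
    for n in range(1, max_steps + 1):
        new = (OR(a0, F[1]), AND(a1, F[0]))        # F^1 = a^0 + F^0, F^0 = a^1 * F^1
        if damp_after is not None and n > damp_after:
            # Oscillation analysis: new equation = new(t) * old(t - 1).
            new = (AND(new[0], F[0]), AND(new[1], F[1]))
        if new == F:
            return F
        F = new
    raise RuntimeError("equation oscillation")

try:
    derive(max_steps=50)
    oscillated = False
except RuntimeError:
    oscillated = True

stable = derive(max_steps=50, damp_after=4)
print(oscillated, stable)
```

Without damping the equations alternate between two sets and the bound is reached. With damping they settle to $F^1 = a^0$ and $F^0 = 0$, meaning that only a = 0 leaves the gate at a known value.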


3.1.6 Complete description of equation derivation 


The complete algorithm for generating the equations for a sequential 
circuit is shown in Fig. 7. Only two parts of the flow chart have not 
been explained previously in this section. One of these parts is the 
method of handling the zero-delay gates. The outputs of all the
zero-delay gates are calculated before the next list of unit-delay gates is
processed. These output equations are assigned to the zero-delay gates 
immediately. 

The remaining unexplained part is the initial-state pass. This pass 
simply examines the circuit and propagates forward (before the input 
variables are applied) the effect of any gates set to logical 0 or 1 and 
any faults. For example, if gate G drives gate H and gate G is set to 
logical 0, this pass determines that the output of H should be logical 1. 

This completes the description of the equation-generation process for 
fault-free sequential circuits. Next, the algorithm for generating the 
equations for a sequential circuit in the presence of a single fault is 
described. 


[Flow chart blocks: zero-delay equation computation; increment model time; change input variables; any more changes scheduled?; completed all sequences; unit-delay equation computation; race analysis; oscillation? (yes: oscillation analysis); assign equations.]

Fig. 7—Equation computation flow chart. 


3.2 Equations containing faults 


Equations for circuits containing faults may be derived in a way 
similar to those used for fault-free circuits. This method allows tests 
to be generated that can detect a specific fault. For efficiency, it is 
possible to consider several single faults simultaneously. The single 
faults considered here are the gate outputs stuck-at-one and stuck-at- 
zero as well as gate inputs open (e.g., open diode or emitter). The input 
open on a NAND or AND gate will be treated as stuck-at-one while the 
input open on a NOR or OR will be treated as stuck-at-zero. 

Let the variables $x^{0}i$ and $x^{1}i$ represent fault variables. Let
$x^{1}i$ mean the fault x.i is present in the circuit. Similarly,
$x^{0}i$ means the fault is not present in the circuit. For the fault
variables, the i does not represent the ith vector in a sequence;
rather, it represents the ith fault being considered. (Faults are
always denoted by x.i and the associated variables by $x^{0}i$ or
$x^{1}i$.) Since the single faults are assumed to be permanent, any
fault x.i will be present during the entire test sequence. The two
states for x.i allow the comparison of the faulty and fault-free
circuit behavior to derive a test to detect the presence or absence of
the fault in the circuit.

Consider the gate shown in Fig. 2. The fault-free equations are
shown. If, however, the input-open fault on gate G from A is being
examined, then the equations for gate G are shown in Fig. 8a. It is
possible to set gate G to logical 0 either by applying $A^1$ and $B^1$ in
the presence or absence of fault x.1 or by applying $B^1$ in the presence
of fault x.1. It is also possible to set G to logical 1 by applying
$B^0$ in the presence or absence of fault x.1 or by applying $A^0$ in the
absence of fault x.1. Similar analysis for the output stuck-at-zero
fault x.3 and the output stuck-at-one fault x.4 can easily be performed
in the manner shown in Figs. 8b and 8c.

Now assume there is only one fault in the circuit and consider the
case where the fault propagates around a loop and returns to the
site of the failure. If the fault is the input open on gate G from A,
then the equations shown in Fig. 8a can be rewritten as shown in
Fig. 9a where fault x.1 is explicitly considered and D, E, F, ...
represent sum-of-products equations. Computing $G^0$ and $G^1$ yields the
equations shown in Fig. 9a. Figure 9b considers the case where the
fault exists ($x^{1}1$) and Fig. 9c considers the case where the fault
does not exist ($x^{0}1$). Comparison of Figs. 9b and 9c with 9a shows
that the computations proposed in this section for combinational
circuits are also applicable to sequential circuits for the input-open
case.

A similar analysis can be carried out for the gate output stuck-at-one 
and the output stuck-at-zero faults. This demonstrates that the 
equations shown in Fig. 8 for handling faults in combinational circuits 
are also applicable to sequential circuits. 


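(a) INPUT OPEN FAULT EQUATIONS (fault x.1 on input A of gate G)
    $G^0 = A^1 \cdot B^1 + x^{1}1 \cdot B^1$
    $G^1 = A^0 \cdot x^{0}1 + B^0$

(b) OUTPUT STUCK-AT-ZERO EQUATIONS (fault x.3 on the output of gate G)
    $G^0 = A^1 \cdot B^1 + x^{1}3$
    $G^1 = A^0 \cdot x^{0}3 + B^0 \cdot x^{0}3$

(c) OUTPUT STUCK-AT-ONE EQUATIONS (fault x.4 on the output of gate G)
    $G^0 = A^1 \cdot B^1 \cdot x^{0}4$
    $G^1 = A^0 + B^0 + x^{1}4$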


Fig. 8—Equations for handling faults in combinational circuits. 
[Figure panels recovered: (b) physical insertion of fault x.1; (c) fault-free circuit.]


Fig. 9—Equations for handling faults in sequential circuits. 


3.3 The halting problem 


One problem that must be discussed is how to determine when 
sequences of sufficient length have been generated. That is, given the 
equations that represent sequences of length N and the equations that 
represent sequences of length N + 1, will more information be gained 
by generating sequences of length N + 2? The question is answerable® 
if the feedbacks have been identified ; however, the maximum sequence 
length contains factors of the form 2 to the power m, where m is the 
number of circuit inputs. For 500-gate, 40-input circuits, this is an 
absurd number. 

There does not appear to be any practical method of determining 
when to halt the equation-generation process. In practice, the maxi- 
mum sequence length to be considered is supplied by the user. The 
usual procedure is to start with sequences of length 1 and increase the 
sequence length until an acceptable level of undetected faults remains 
using the test-generation schemes presented in the next section. As 
might be expected, the run time increases significantly with increasing 
sequence length such that, even if it were simple to determine when 
to halt, it would probably not be economical. In practice, the halting 
problem has presented no difficulties. It is, however, an interesting 
theoretical problem. 

In practice, the maximum sequence length required to detect all 
faults in the circuit provides some measure of the ease with which 
the circuit can be tested. The shorter the sequence length required, 
the more easily the circuit can be tested. This fact could be used as a 
circuit-design constraint by requiring that all circuits be testable with
sequences of length N or less where N is small. In fact, a 1000-gate,
11-state sequencer was designed so that the flip-flops representing the
state could be written and read directly from circuit inputs and outputs.
This produced an easily testable sequential circuit.


3.4 Clocked circuits 


The algorithms that have been presented allow the circuit input 
leads to be treated as variables [e.g., (a!, a°)] or as logical values 
where logical 0 is (0, 1) and logical 1 is (1, 0). It is possible to allow 
some circuit inputs to be represented by variables and others by 
logical values. Clearly, it is possible to change the logical values 
between logical 0 and 1. Then we have the ability to apply a sequence 
of logical values to an input lead. 

For example, suppose the circuit being considered has a clock lead
whose normal operating waveform is 1-0-1-0 and all other input
leads are static during this cycle. Then it is possible to apply variables
to all but the clock lead and to supply the waveform (1, 0) → (0, 1)
→ (1, 0) → (0, 1) to the clock lead. In this way, ATG does less work
since we have considered a sequence of length 4 on the clock lead and
sequences of length 1 on all other leads. This is considerably more
economical than computing sequences of length 4 over all input leads.

In a similar way, user-specified initialization sequences can be 
applied to the circuit to place it in some desired state before allowing 
ATG to select the next input sequence. This is an effective way of 
using ATG. 


3.5 Self-initializing circuits 

Certain classes of sequential circuits are self-initializing in that, 
regardless of the initial state, the circuit always assumes a known 
state when power is applied. A simple example of such a circuit is 
shown in Fig. 10. Because the flip-flop always initializes to C = 0,
D = 1 or C = 1, D = 0, gate F will always be logical 0, forcing the
flip-flop to the C = 1, D = 0 state.





Fig. 10—Self-initializing circuit. 
If this circuit is assumed to be in an unknown state (B = C = D
= E = F = X), then because $\overline{X} = X$, the basic ATG algorithm
does not determine the required initial state. The operation of ATG
requires that gates be forced to some initial state by the application
of input vectors from some initial state. Hence, self-initializing
circuits require that the proper initial state be specified by the
user. This has not proved to be a problem in practice since most
circuits contain initializing leads.


IV. TEST GENERATION FROM THE EQUATIONS 


Two different schemes for generating tests are described in this 
section. The first scheme described is the generation of tests to detect 
single faults, where the equations are derived in terms of these faults. 
However, in a 500-gate, 2000-fault circuit it is not economical to attack 
all 2000 faults on a one-at-a-time basis. The second method for test 
generation is aimed at detecting large numbers of faults as easily as 
possible. It attacks the problem by essentially attempting to detect 
the stuck-at-one and stuck-at-zero fault at each circuit input lead by 
observing each circuit output lead. This is called the maximum-cover 
strategy. This scheme typically detects around 90 percent of the 
classical faults if the equations reasonably describe the circuit—that 
is, if the sequence length used is long enough. 


4.1 Generating a test to detect a fault 


To detect fault x.i, it is necessary to select a test (input sequence)
that will force some output of the circuit to have the value k for
k = 0, 1 in the presence of the fault x.i and to have the value
$\overline{k}$ in the absence of fault x.i starting from the given
initial state. Let one output gate G of a circuit have the following
equations (by simple factoring):

$G^0 = A + B \cdot x^{1}i + C \cdot x^{0}i$

$G^1 = D + E \cdot x^{1}i + F \cdot x^{0}i$, (6)


where A, B, ..., F are also sum-of-products expressions. This means
that the terms in A (D) are the only terms that set G = 0 (G = 1)
regardless of the presence or absence of fault x.i. The tests to detect
fault x.i at gate G are given by $B \cdot F + C \cdot E$. This is proven as
follows.

Since the fault either exists or does not exist, $x^{1}i \cdot x^{0}i = 0$.
First consider the case in which k = 0. Since $G^1$ ($G^0$) represents
exactly those conditions that set G = 1 (G = 0), then
$G^1(x^{0}i = 1) = D + F$ represents those conditions that set G = 1 in
the absence of fault x.i. Similarly, $G^0(x^{1}i = 1) = A + B$ represents
those conditions that set G = 0 in the presence of the fault. Hence,
every condition (input vector) that makes the faulty output of G = 0
and makes the good output of G = 1 is given by
$(A + B) \cdot (D + F) = A \cdot D + B \cdot D + A \cdot F + B \cdot F$.
Examination of the terms of this equation reveals
$A \cdot D = B \cdot D = A \cdot F = 0$. Term $A \cdot D = 0$ because, if it
were not zero, then there would be some term in $A \cdot D$ that could set
G = 1 and G = 0 simultaneously. This is clearly impossible. Similarly,
$B \cdot D \neq 0$ ($A \cdot F \neq 0$) implies that in the presence
(absence) of the fault, there is some term in $B \cdot D$ ($A \cdot F$) that
can set G = 1 and G = 0 simultaneously. Therefore, any term that can set
G = 0 in the presence of the fault and G = 1 in the absence of the fault
must be in $B \cdot F$.

For the case in which k = 1, the test must be a term of
$(D + E) \cdot (A + C) = D \cdot A + D \cdot C + E \cdot A + E \cdot C$. By
similar analysis, $A \cdot D = D \cdot C = A \cdot E = 0$. Therefore, a term
that sets G = 1 in the presence of the fault and G = 0 in the absence of
the fault must be in $E \cdot C$.

Since the problem is to detect fault x.i without regard to the output
value of G, any term in $B \cdot F + E \cdot C$ is a valid test. Therefore,
all tests to detect fault x.i at gate G can be expressed as


Detection Tests = $B \cdot F + E \cdot C$.


If $B \cdot F + E \cdot C = 0$, there is no test that will detect fault x.i
at gate G. It is then necessary to examine each remaining circuit output
to determine if x.i is detectable. If x.i is not detectable at any
circuit output, then there exists no test to detect x.i for the sequence
length specified.

Clearly, this algorithm generates every test that will detect fault
x.i at each output. Since it is probably necessary to detect the fault
only once, the first valid test found usually terminates the process.
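
The factoring and the detection-test expression can be illustrated with the single-gate equations of Fig. 8. The sketch below is not the LAMP implementation. It assumes equations are stored as frozensets of product terms whose literals are (name, value) pairs, and it uses the hypothetical names "x1", "x3", and "x4" for faults x.1, x.3, and x.4.

```python
def absorb(terms):
    """Drop any product that strictly contains another product (p + p*q = p)."""
    terms = set(terms)
    return frozenset(t for t in terms if not any(s < t for s in terms))

def consistent(term):
    names = [name for name, _ in term]
    return len(names) == len(set(names))

def AND(p, q):
    return absorb(s | t for s in p for t in q if consistent(s | t))

def OR(p, q):
    return absorb(p | q)

def P(*lits):
    """One product term, e.g. P(("A", 1), ("B", 1)) for A^1 * B^1."""
    return frozenset(lits)

def factor(eq, fault):
    """Split eq into (fault-free part, cofactor of fault present, cofactor of fault absent)."""
    free, present, absent = set(), set(), set()
    for t in eq:
        if (fault, 1) in t:
            present.add(t - {(fault, 1)})
        elif (fault, 0) in t:
            absent.add(t - {(fault, 0)})
        else:
            free.add(t)
    return frozenset(free), frozenset(present), frozenset(absent)

def detection_tests(G0, G1, fault):
    """All tests for the fault at this output: B*F + E*C."""
    A, B, C = factor(G0, fault)
    D, E, F = factor(G1, fault)
    return OR(AND(B, F), AND(E, C))

# Fig. 8a: input-open fault x.1 on input A of the NAND gate.
tests_a = detection_tests(
    frozenset([P(("A", 1), ("B", 1)), P(("x1", 1), ("B", 1))]),
    frozenset([P(("A", 0), ("x1", 0)), P(("B", 0))]), "x1")
# Fig. 8b: output stuck-at-zero fault x.3.
tests_b = detection_tests(
    frozenset([P(("A", 1), ("B", 1)), P(("x3", 1))]),
    frozenset([P(("A", 0), ("x3", 0)), P(("B", 0), ("x3", 0))]), "x3")
# Fig. 8c: output stuck-at-one fault x.4.
tests_c = detection_tests(
    frozenset([P(("A", 1), ("B", 1), ("x4", 0))]),
    frozenset([P(("A", 0)), P(("B", 0)), P(("x4", 1))]), "x4")

print(tests_a, tests_b, tests_c)
```

For the input-open fault the only test is A = 0 with B = 1, which agrees with treating an open NAND input as stuck-at-one.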


4.2 The maximum-cover strategy 


The maximum-cover strategy has been quite successful. In most 
cases, it has detected from 85 to 100 percent of the faults in the circuit 
that are detectable with the maximum sequence length specified. For
highly sequential circuits, a short maximum sequence length may
detect few faults because the circuit cannot be exercised completely
without using a long sequence of input vectors.

The maximum-cover strategy operates on the fault-free equations 
derived for the circuit according to the maximum sequence length 
specified. The basic idea is simply to attempt to detect each primary 
circuit input fault at each circuit output. Factoring the output
equations as before yields:


$$F^0 = A + B \cdot a_j^1 + C \cdot a_j^0$$


$$F^1 = D + E \cdot a_j^1 + F \cdot a_j^0, \qquad (8)$$


where A, B, C, D, E, and F are sum-of-product terms. This case
attempts to detect the input faults on a at the output F. A test is
formed in a manner similar to that used for detecting faults, except
that it is necessary here to specify the value to be assigned to input
lead $a_j$.


$$\text{Maximum-Cover Test} = (E \cdot C + F \cdot B) \cdot (a_j^1 + a_j^0).$$


This process is repeated until an attempt has been made to test each 
circuit input fault at each output lead. This scheme actually produces 
every test that satisfies the above equation. The shortest test (fewest 
input leads set to logical 0 or 1) is selected in each case. 
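
The selection rule in the preceding paragraph can be sketched in a few lines. The representation is an assumption made for this example: each candidate test is a dict whose keys are the input leads it fixes to 0 or 1, and any lead not listed is left unspecified.

```python
def shortest_test(tests):
    """Pick the test that assigns the fewest input leads; ties keep the first."""
    if not tests:
        return None
    return min(tests, key=len)


# Hypothetical candidate tests produced for one input fault at one output.
candidates = [
    {"a": 1, "b": 0, "c": 1},
    {"a": 1, "d": 0},
    {"b": 1, "c": 0},
]
best = shortest_test(candidates)
print(best)
```

Leaving leads unspecified keeps each test compatible with as many other tests as possible, which is one plausible reason for preferring the shortest term.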

The time spent performing this computation is usually much less 
than that required to derive the equations. Also the time and results 
of the maximum-cover operation must be weighed against the cost of 
detecting additional faults on a one-at-a-time basis. Thus, while 
maximum cover is an expensive heuristic (when compared to, say, 
random-number test generation), it provides a set of tests that is 
usually good enough so that one can economically attack the remaining 
faults on a one-at-a-time basis. As a general rule, about 5 to 10 faults 
can be detected using the one-at-a-time strategies for the same cost 
as one pass of the maximum-cover strategy which inherently tries to 
detect all faults. 


V. EXPERIMENTAL RESULTS 


The final measure of an automatic test generation system is how 
well it does its job on real circuits. The ATG system has been pro- 
grammed and is being used at several locations in Bell Laboratories. 
The algorithms presented here are generally not useful for hand com- 
putation. The version of ATG used by Bell Laboratories on the IBM 
360, Model 67, collects certain data each time it runs successfully. 
The data collected include the execution CPU time, number of test 
vectors generated, number of faults detected, number of gates in the 
circuit, and number of flip-flops in the circuit. 

This implementation of ATG requires about 100,000 bytes for 
program storage. Other storage, used during execution, depends on the 
characteristics of the circuit being run. As the equations get longer,
the storage requirements increase. Generally speaking, ATG requires
from one to five megabytes of virtual storage. This implementation
allows only unit- and zero-gate delays, handles single stuck-at-one and
stuck-at-zero faults, and supports both fault-detection test generation
for single faults and the maximum-cover test-generation strategy.

The data that have been collected indicate that ATG has been 
primarily used to generate tests via the maximum-cover strategy. In a 
few uses of ATG, the user attempted to detect only specified faults; 
these data are not included in this paper. 

The data collected represent only successful ATG runs. If the 
same circuit was run several times, then only the run that produced 
the fewest undetected faults (e.g., used the longest sequence length) 
is included. This is consistent with the recommended operational 
procedure, which starts with a short sequence length and increases it 
until an acceptable level of fault detection is reached. Faults in unused 
gates are included both in the undetected faults and in the total 
number of faults in the circuit. 

The results of 300 ATG runs on 120 circuits using the maximum- 
cover strategy are summarized in Figs. 11 through 15. The average 
circuit contained about 270 gates including about 10 flip-flops in the 
sequential circuits. Thirty-two circuits were combinational. ATG 
produced an average of 94 vectors in an average of 43 seconds of 
IBM 360, Model 67, CPU time resulting in an average detection level 
of 88 percent of the total number of faults in the circuit. However, 
the median percentage of undetected faults was only 7 to 8 percent. 
The longest sequence length used for these circuits was 5. Unfortunately,
there is almost no correlation between the five parameters
plotted in Figs. 11 through 15. The data correlate only in the extreme
cases. For example, the circuit with 32 flip-flops produced a large
percentage of undetected faults. For most of the data, none of the
parameters correlates significantly.

Fig. 11—Distribution of number of test vectors per circuit.
Fig. 12—Distribution of percentage of undetected faults per circuit.
Fig. 13—Distribution of number of gates per circuit.
Fig. 15—Distribution of CPU time per circuit.

ATG did not produce acceptable results on all circuits. In general, 
ATG is limited by the length of the equations generated. As these 
equations become long, the execution time increases and ATG may 
not terminate successfully due to excessive run time and/or storage 
requirements. The equations can become long as a result of long 
sequence lengths (e.g., shift registers and counters) or as a result of 
the function of the logic circuit (e.g., parity trees and adders). In 
addition, circuits such as parity trees produce quite long equations 
and ATG generates more vectors than the minimum required. 

One circuit recently run on ATG using the maximum-cover strategy
is worthy of special mention. The circuit is a 1000-gate, 11-state 
sequencer plus input, output, and transition logic. The sequencer 
state, represented by four D-flip-flops, can be read and written from 
circuit outputs and inputs respectively. Extensive use is made of the 
system clock to control transitions and gating. The clock waveforms 
were supplied to ATG by the user. ATG, using the clock and sequences 
of length one, generated 770 vectors in about 800 seconds, detecting 
about 95 percent of the faults in the circuit. The success of ATG here 
is partially due to the “easily testable design” which allows the 
sequencer state to be directly read and written. 

In practice, while ATG will not efficiently handle all circuits, it 
appears to be an economical tool for automatic test generation for
“mildly” sequential circuits containing around 500 gates. The design
of circuits in an “easily testable” manner greatly eases the work
required to automatically generate test vectors for the circuit.


VI. SUMMARY AND CONCLUSION 


In summary, the method used to generate tests is as follows: 


(i) Set the maximum sequence length k = 1.
(ii) Generate equations for the logic circuit with sequence length k.
(iii) Generate tests using the maximum-cover strategy.
(iv) Simulate the tests. If the percentage of undetected faults is
less than, say, 10 percent, proceed to step (v). Otherwise, set
k = k + 1 and return to step (ii).
(v) Generate tests for remaining undetected faults (one fault at
a time) detectable with sequence length k.
(vi) If an acceptable percentage of undetected faults remains, stop.
Otherwise, set k = k + 1 and return to step (v).


In practice, most users of ATG have been satisfied with the ATG
results without trying steps (v) or (vi).
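
The six steps can be read as two nested retry loops. The sketch below is only a control-flow illustration: every callable passed in (`gen_equations`, `max_cover`, `single_fault`, `simulate`) is a hypothetical hook, the thresholds are example values, and the `max_k` guard is an added assumption so that the loops always terminate.

```python
def generate_tests(gen_equations, max_cover, single_fault, simulate,
                   total_faults, cover_goal=0.10, final_goal=0.05, max_k=5):
    """Follow steps (i)-(vi); every callable argument is a hypothetical hook.

    gen_equations(k)            -> equations for sequence length k
    max_cover(equations)        -> list of tests
    single_fault(k, undetected) -> list of tests aimed at those faults
    simulate(tests)             -> set of faults still undetected
    """
    k = 1                                              # step (i)
    while True:
        equations = gen_equations(k)                   # step (ii)
        tests = max_cover(equations)                   # step (iii)
        undetected = simulate(tests)                   # step (iv)
        if len(undetected) / total_faults < cover_goal or k >= max_k:
            break
        k += 1
    while True:
        tests = tests + single_fault(k, undetected)    # step (v)
        undetected = simulate(tests)
        if len(undetected) / total_faults <= final_goal or k >= max_k:
            return tests, k, undetected                # step (vi)
        k += 1


# Toy stand-ins: 20 faults; a maximum-cover run at length k detects faults
# 0 .. 8k-1, and a single-fault test ("sf", f) detects exactly fault f.
TOTAL = 20

def toy_simulate(tests):
    detected = set()
    for kind, value in tests:
        if kind == "mc":
            detected |= set(range(min(8 * value, TOTAL)))
        else:
            detected.add(value)
    return set(range(TOTAL)) - detected

def toy_single_fault(k, undetected):
    return [("sf", f) for f in sorted(undetected) if f < 16 + 2 * (k - 1)]

result = generate_tests(lambda k: k, lambda eq: [("mc", eq)],
                        toy_single_fault, toy_simulate, TOTAL,
                        cover_goal=0.25, final_goal=0.05)
print(result)
```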

The algorithms treat a logic network as an interconnection of gates 
which are assigned some fixed time delay. The technique generates 
two equations, $F^0(t)$ and $F^1(t)$, for each gate in the circuit. These
equations denote the input conditions required to set gate F to logical
0 and 1 respectively at time t. Because the technique starts from the
circuit inputs and proceeds forward through the circuit (like the 
signal flow), it is not necessary to identify feedback leads. Therefore, 
both combinational and sequential circuits can be handled by the 
same algorithm. 

The primary difference between combinational- and sequential-circuit
test generation is that several input vectors may be required
in a sequential circuit to set the desired state, detect some fault, and 
then propagate the fault to some output lead. The number of input 
vectors required to perform some test on the circuit is called the 
sequence length of the test. A sequence length of one is sufficient to 
generate all tests for a combinational circuit since it has no memory. 
The maximum sequence length to be considered is supplied by the user. 

The test-generation algorithms first generate the equations for the 
circuit, taking into consideration the gate delays and the maximum 
sequence length specified. These equations also take into account the 
effect of various single stuck-at-one, stuck-at-zero, or open-gate input 
faults when tests are being generated for specific faults. Then, from 
these equations, the algorithms will generate a test for any of the 
above faults if such a test exists within the sequence length specified. 



Equations may also be generated that represent only the fault-free 
circuit. It is then possible to generate tests from these equations which 
exercise the circuit in such a way that many faults are detected. 
This has been a popular feature because it produces good results 
economically. 

These algorithms have been implemented and are currently being 
used to generate tests for circuits containing around 500 gates. Quite 
good results have been produced using the maximum-cover technique. 
A median of 7 to 8 percent undetected stuck-at faults was reached in 
less than 1 minute of IBM 360, Model 67, CPU time on a sample of 
some 120 circuits. Because of the success of the maximum-cover
techniques, very little use has been made of the “single-fault”
techniques.

In conclusion, ATG is a production system that has been found to be
a valuable tool for the generation of circuit-pack tests.


Controllability, Observability, and Maintenance
Engineering Technique (COMET)


By H. Y. CHANG and G. W. HEIMBIGNER 
(Manuscript received February 28, 1974) 


A new technique has been developed for organizing (or reorganizing) 
system design to enhance fault diagnosability. This technique is called the
controllability, observability, and maintenance engineering technique, or 
COMET. Using graph-theoretical analysis, one can systematically apply 
COMET to a proposed or an existing digital system to determine the 
placement of control, access, and monitor points for diagnostic testing. In 
addition, it provides a means of studying the trade-offs between fault
resolvability and the cost of maintenance hardware and/or packaging. 

COMET offers an orderly approach to implementing an overall diag- 
nostic design by providing guidelines in early design stages. A design 
developed using COMET has the following advantages: trouble location 
manual data can be generated without the use of fault simulation, multiple 
faults and/or nonclassical faults are locatable if they are detectable, and 
diagnostic or trouble-location information can be easily updated in accord- 
ance with hardware changes. Studies indicate that applying COMET to an 
existing processor design would require a modest increase in hardware of 
less than 10 percent. 


I. INTRODUCTION


Recent advances in integrated-circuit technology offer the circuit
and system designers many opportunities to explore new, low-cost,
high-performance design techniques. The increased operational
speed and the logic complexity of many medium-scale-integration
(MSI) and large-scale-integration (LSI) designs, however, also present
acute problems in factory testing and field maintenance. For
factory testing, it becomes increasingly difficult to diagnose faults in
an LSI package, partly owing to equipment packaging constraints and
partly to inadequate fault-isolation technique(s) for nonclassical
and/or multiple faults. In field maintenance, while the isolation of
faults to a component or chip level is unimportant, the problem of
quickly, and automatically, detecting and recovering from faults is
further compounded by increases in circuit size and complexity. In
addition, the cost of using fault-simulation techniques to generate
data for the trouble-location manual¹ (TLM) may become economically
prohibitive, especially for large systems. Finally, the problem of
accurately updating trouble-location data whenever circuit or design
changes occur remains important but unresolved.

Many designers of fault-tolerant computer systems have studied
these problems.² Some have proposed design approaches with built-in
automatic-fault-detection hardware.²,³ Others have explored diagnosable
design concepts purely from a structural standpoint based on
graph-theoretical techniques.⁴,⁵ Unfortunately, the search for practical
methods of generating (and updating) TLM data for large systems
has been largely unsuccessful.

This paper describes a technique called COMET (controllability, 
observability, and maintenance engineering technique) for organized 
system design and system reorganization to enhance diagnosability. 
COMET enables a designer to systematically establish the diagnos- 
ability of a system by combining circuit design, physical arrangement, 
and maintainability considerations. It also offers an efficient and prac- 
tical way to generate TLM data. 

In Section II, the concept of COMET is described, followed by a 
detailed discussion of the technique and its relation to fault location. 
Possible methods of implementation are then discussed, along with the 
results of applying this technique to a small self-checking processor.⁶
Lastly, the long-term impact of COMET on system design is pointed 
out. 


II. DESCRIPTION OF CONCEPT AND TECHNIQUES
2.1 Philosophy and characteristics 


The design of a fault-location procedure involves several steps. For 
a given processor or circuit, a set of tests capable of detecting all the 
assumed faults is first derived. The usual assumptions are that faults
are solid and are of the “stuck-at” type. This step is called the
test-derivation phase. These tests are then verified either by sample fault
simulation or by a complete fault simulation to determine if they are
indeed a good set of tests. The next step is to derive for each fault in
the fault set the corresponding test results. This is done by simulating 
each fault with respect to the tests that are designed to detect this 
fault. The test results obtained by this technique are then processed 
to form a TLM. These two steps are called TLM data generation and 
data processing. 

Suppose the processor has only one circuit pack. Every time a fault 
occurs and is detected, this circuit pack is replaced. It will no longer 
be necessary to distinguish faults in this processor; the fault-location 
problem is eliminated and the TLM data-generation and processing 
steps disappear. The only step required is to derive the test capable 
of detecting all faults and to record the test results of the “good 
machine.” 

Now if the processor is composed of more than one circuit pack, the 
following conceptual approach may be used. At the beginning of 
diagnosis, half of the processor is disabled. This can be done, for ex- 
ample, by physically removing half of the circuit packs in a processor. 
Diagnostic tests are run only on the enabled portion and only the
pass-or-fail data of the tests are recorded. Thus, if a fault exists in the
enabled portion of the processor, the test result will give a failure
indication, meaning that the fault is not in the disabled portion. However,
if the test result gives a pass indication, this means that the fault is in 
the portion that has been disabled or removed. 

Based on the pass-or-fail indication, one can further partition the 
enabled portion (in the case where the fault is in the portion that re- 
mained enabled during testing), or the disabled portion (in the case 
where the fault is in the disabled portion) to allow further testing. A 
general flow diagram of this procedure is shown in Fig. 1. Disabling
means that the circuit packs associated with the disabled portion are
either physically disconnected from the processor or are logically in a
passive state.

Fig. 1—Flow diagram of disabling process. (Disable half of processor;
test the enabled portion; on pass, enable half of previously disabled
portion; on fail, disable half of enabled portion.)

Figure 2 shows an example of how faults are located using this tech- 
nique in a processor composed of four circuit packs (A, B, C, and D). 
Assume that the fault is in circuit pack B. Diagnostic tests are first 
performed with circuit packs C and D disabled or removed. The tests,
which are designed to detect all the faults in packs A and B, are run 
and a failure is indicated. It can then be concluded that some failure 
exists in either pack A or B. Next, the diagnostic tests are run on cir- 
cuit pack A with circuit pack B disabled. This time, however, because 
the fault (which is in circuit pack B) has been masked, the test result 
will show a pass indication. It can then be concluded that a fault is in 
circuit pack B because it gives a fail and then a pass signature. In other 
words, by successively reducing the circuitry under test and by only 
recording the fail/pass results in each step, the locations of all faulty 
circuit packs are uniquely identified. 
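
The four-pack example amounts to a binary search over an ordered list of packs. The sketch below assumes a single faulty pack, assumes the packs are listed so that any leading group can be tested with the rest disabled, and uses a hypothetical hook `tests_pass(enabled)` that returns True when the diagnostic tests pass on the enabled packs.

```python
def locate_faulty_pack(packs, tests_pass):
    """Return (faulty pack, pass/fail signature) by successive halving."""
    lo, hi = 0, len(packs)        # the fault lies in packs[lo:hi]
    signature = []
    while hi - lo > 1:
        mid = (lo + hi) // 2
        passed = tests_pass(packs[:mid])   # packs[mid:] are disabled
        signature.append("pass" if passed else "fail")
        if passed:
            lo = mid              # fault is in the disabled portion
        else:
            hi = mid              # fault is in the enabled portion
    return packs[lo], signature


packs = ["A", "B", "C", "D"]

def make_hook(faulty):
    # Toy model: the tests fail exactly when the faulty pack is enabled.
    return lambda enabled: faulty not in enabled

located, signature = locate_faulty_pack(packs, make_hook("B"))
print(located, signature)

# Each pack yields its own pass/fail pattern in this toy model.
signatures = {p: tuple(locate_faulty_pack(packs, make_hook(p))[1]) for p in packs}
print(signatures)
```

Only the good-machine responses are needed to decide pass or fail at each step, which is the property the text relies on.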

It is quite apparent that the basic difference between this technique 
and the conventional approach is that in the latter one must, for isola- 
tion purposes, distinguish the faults not only from the good machine 
but also from every faulty machine. In other words, the important 
consideration is where the fault is. In the proposed approach, however, 
we are not required to distinguish the various faulty machines; we are 
only interested in whether the test result from the good machine differs 
from that of the faulty machine. Resolution is obtained by successively 
reducing the circuitry under test. 

For each element of the partition (e.g., packs A and B of the first
partition in Fig. 2, or circuit pack A of the second partition in Fig. 2),
only a pass or fail indication is required. This means that the test result
of the good machine for each partition is all that is required; there is
no need to simulate faults to generate sufficient information for the
distinguishability of the faults within the element of partition. It can
be seen that each faulty circuit pack is identifiable by a unique pattern
of pass-and-fail indications; the pass-and-fail numbers are in a
one-to-one correspondence to the circuit packs in the processor. This
means that the number of trouble numbers in the TLM is drastically
reduced.

Fig. 2—Proposed method of fault location.

Another characteristic of the proposed technique is that multiple 
faults on a circuit pack are locatable as long as they are detectable by 
the applied tests. This should improve accuracy since the requirement 
for consistency of test signatures for TLM lookup no longer exists. 


2.2 Description of techniques 
2.2.1 Controllability and observability 


There are some problems that must be solved. First, one must design 
a disabling process that allows circuit packs to be selectively disabled 
or removed from the processor. Second, controllability of the various 
circuit packs or functional blocks must be established. For example, 
as shown in Fig. 3a, gate G’ must be operational to test gate G. That 
is, G’ must not be disabled when G is being tested. If, however, G’ is 
in the portion that has been disabled, the testing of G and therefore 
the test results become meaningless because G is not controllable 
from G’. 

Similarly, a proper ordering of the various circuit packs or functional
blocks in relation to observability must be derived. As shown in
Fig. 3b, to observe the test results of a fault (marked X) associated
with gate G, the output of G must not be in the logic block that has
been disabled. Otherwise, the test results will show an all-tests-pass
outcome even though the fault is present on gate G. Thus, the necessary
condition for this technique is the establishment of an ordering relation
of controllability and observability such that a partitioning procedure
can be used for fault isolation. If the elements of partition $i$ do not
depend on elements of partition $j$ (where $j > i$) for controllability
and observability, one can successively apply the partitioning and testing
process to locate the faulty circuit pack, provided that a technique for
disabling circuit packs is available.

Fig. 3—(a) Controllability-ordering relations. (b) Observability-ordering relations.


2.2.2 Logic-disabling technique 


There are a number of disabling techniques possible. The first and 
the most obvious one is to physically remove the circuit pack(s) from 
the processor. This is only feasible if a reliable connector is available 
and the number of circuit packs involved is small. An alternative is to 
physically disconnect the input and output leads of a circuit pack by 
some mechanical device. A third alternative in some cases is simply to 
remove the power and ground leads of a circuit pack, thus putting the 
circuit pack in the passive state. A fourth alternative is the logic-dis- 
abling technique illustrated in Fig. 4. 

If control lead $C_i$ (Fig. 4) goes to 0, it forces the output of the output
gates to logical value 1, which for NAND logic is the passive state. After
the circuit pack has been disabled, error symptoms caused by any fault
or faults in the circuit pack (marked X in the illustration) cannot
propagate beyond the circuit-pack outputs.* Thus, if each circuit
pack $i$ is modified by adding disable control lead $C_i$, each circuit pack
in the processor can be enabled or disabled selectively.
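
A small truth-table check makes the disabling behavior concrete. The model below is an assumption made for illustration: the pack has one NAND output gate with two internal signals plus the control lead, and an optional `stuck_at` argument overrides the gate output to mimic an output fault.

```python
from itertools import product


def nand(*inputs):
    """NAND of any number of 0/1 inputs."""
    return 0 if all(inputs) else 1


def pack_output(signals, c_i, stuck_at=None):
    """Output gate of a circuit pack with disable control lead c_i."""
    if stuck_at is not None:
        return stuck_at           # an output fault overrides the gate
    return nand(*signals, c_i)


# With c_i = 0 the output sits at the passive value 1 for every input pair.
disabled = [pack_output(s, 0) for s in product((0, 1), repeat=2)]
# With c_i = 1 the gate behaves as an ordinary two-input NAND.
enabled = [pack_output(s, 1) for s in product((0, 1), repeat=2)]
print(disabled, enabled)
```

In this model a stuck-at-0 output is not masked by the control lead, since `pack_output((1, 1), 0, stuck_at=0)` still returns 0 rather than the passive 1.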


2.2.3 Partitioning techniques 


Once a practical way of disabling circuit packs is obtained, the next 
step is to devise a technique of ordering the circuit packs or the func- 
tional blocks based on the observability and controllability relations. 
The controllability and observability relations can best be understood 
by the example shown in Fig. 5. 

We define a functional node as a functionally well-defined logic circuit,
such as a rotate circuit, an adder, etc. In some instances, a functional
node is also defined as a logically or physically related block of
circuitry, such as bits 0 through 7 of the X, Y, Z registers. To test a
functional node, one must apply signals via control inputs and observe
test results via some observable outputs. Thus, in Fig. 5, it can be
concluded that to test functional node R, the inputs are obtained from
functional node A, and control is obtained from functional node C.
Functional nodes A and C, therefore, must be operational when functional
node R is being tested. Similarly, to observe the test results of
R, functional nodes $O_1$ and $O_2$ are used. In other words, functional
nodes A and C control R, and functional nodes $O_1$ and $O_2$ observe R;
they must not be in the portion that is disabled when node R is being
tested.

* Note that the stuck-at-0 faults on the disabled gates can still propagate.
Treatment of these faults is discussed in Section 2.2.4.

Fig. 4—Circuit-pack logic-disabling process.


Fig. 5—Controllability and observability relations. (Control and access
relations: $A \Rightarrow R$, $C \Rightarrow R$. Observability relations:
$O_1 \rightarrow R$, $O_2 \rightarrow R$.)

A set of relations can now be defined as follows: node A controls node
B, if node B requires control from A to be fully testable. This relation
is expressed by $A \Rightarrow B$. Similarly, node A observes node B, if node
A is required to observe the test results of node B. This is represented
by $A \rightarrow B$.

From a set of properly defined functional nodes, the controllability 
and the observability relations among neighboring nodes can be ob- 
tained. These relations are conveniently represented by a directed 
graph where the nodes of the graph correspond to the functional 
nodes and the edges of the graph correspond to the controllability 
and/or the observability relations. If the resultant graph is loop- 
free, all the nodes can be arranged in a partially ordered list so that 
the higher-order* nodes can be fully tested independently of any 
lower-order node(s). This guarantees a relation of controllability 
and observability among all the functional nodes such that we are able 
to partition the system. However, if the resultant graph is not loop- 
free, a conflict exists from a controllability and observability viewpoint. 

For example, if some functional node $F_1$ controls functional node $F_2$,
which in turn controls $F_3$, and if $F_3$ observes $F_1$, then a loop exists
containing $F_1$, $F_2$, and $F_3$. This loop must be “broken” to test these nodes.
In this context, breaking loops means that additional control or
observation must be added in appropriate places to obtain a loop-free
graph.

The process of systematically ordering the nodes and identifying 
conflicts in the directed graph makes use of graph-theory techniques.⁴
The directed graph of the controllability and observability characteristics
of functional nodes can be represented by a connectivity
matrix $C = [c_{ij}]$, where $c_{ij} = 1$ if there is a directed edge from $i$ to $j$.
In other words, if node $i$ controls node $j$, the entry $c_{ij} = 1$ in the
connectivity matrix for the controllability relation. Similarly, if node
$k$ observes node $p$, the entry $c_{kp} = 1$ in the connectivity matrix for
observability.
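
As a concrete illustration, the sketch below builds the two connectivity matrices for the relations listed with Fig. 5 and merges them entry by entry. The node ordering, the list-of-lists matrix layout, and the helper names are assumptions made for this example.

```python
def connectivity_matrix(nodes, relations):
    """Return C with C[i][j] = 1 when there is a directed edge i -> j."""
    index = {name: i for i, name in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for src, dst in relations:
        matrix[index[src]][index[dst]] = 1
    return matrix


def merge(c1, c2):
    """Entry-wise OR of two connectivity matrices of the same size."""
    return [[a | b for a, b in zip(r1, r2)] for r1, r2 in zip(c1, c2)]


nodes = ["A", "C", "R", "O1", "O2"]
controls = [("A", "R"), ("C", "R")]      # A => R, C => R
observes = [("O1", "R"), ("O2", "R")]    # O1 -> R, O2 -> R

c_control = connectivity_matrix(nodes, controls)
c_observe = connectivity_matrix(nodes, observes)
c_merged = merge(c_control, c_observe)
for name, row in zip(nodes, c_merged):
    print(name, row)
```

Every row except the one for R has a single 1 in the R column, which reflects that R depends on all four neighbors for testing.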

A node $j$ is reachable (controllable or observable) from node $i$ if and
only if there is at least one directed path from $i$ to $j$. A graph is strongly
connected if and only if every node is reachable from every other node.
This means that in a strongly connected graph, every node is in at
least one loop. A maximal strongly connected (MSC) subgraph is one
that includes all possible nodes that are strongly connected with each
other. This designates a maximum set of functional nodes that are in
conflict from a controllability and observability viewpoint. A link
subgraph of a graph is one that contains no strongly connected subgraphs
or unconnected subgraphs in it. The link subgraph is loop free
and therefore all the nodes in it can be arranged in a partially ordered
list.

* The term higher refers to the location in a diagnostic procedure. The higher nodes
are verified first.

Generally speaking, all functional nodes are either in some MSC 
subgraph or in some link graph from the observability and control- 
lability viewpoint. The objectives are, therefore, to locate the MSCs 
(i.e., areas of conflicts) in the directed graph and to add additional 
controllability or observability points in order to break the MSCs and 
arrive at a partial ordering of the nodes. The following is an algorithm 
for performing this function. 


(i) Construct the connectivity matrix of the functional nodes of the
processor with respect to the controllability and observability
relations.
(ii) Locate all MSCs and represent them as pseudonodes.
(iii) Establish the order of the nodes in the new directed graph.
(iv) Any MSCs left? If yes, go to step (v). If no, exit.
(v) Pick an MSC of the highest order and apply the MSC-breaking
technique; then return to step (iii).
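
Step (ii) can be sketched with a reachability matrix. The approach below, a Warshall-style transitive closure followed by grouping mutually reachable nodes, is one simple way to locate MSCs and is chosen here for brevity; the paper does not state which procedure was actually used. The node names are hypothetical.

```python
def transitive_closure(c):
    """Return R with R[i][j] = 1 when a directed path of length >= 1 runs i -> j."""
    n = len(c)
    r = [row[:] for row in c]
    for k in range(n):
        for i in range(n):
            if r[i][k]:
                for j in range(n):
                    if r[k][j]:
                        r[i][j] = 1
    return r


def find_mscs(nodes, c):
    """Group nodes that lie on a loop into maximal strongly connected sets."""
    r = transitive_closure(c)
    seen, mscs = set(), []
    for i, name in enumerate(nodes):
        if name in seen or not r[i][i]:
            continue              # already grouped, or not on any loop
        group = [nodes[j] for j in range(len(nodes)) if r[i][j] and r[j][i]]
        seen.update(group)
        mscs.append(group)
    return mscs


# Hypothetical graph: X -> F1 -> F2 -> F3 -> F1
nodes = ["X", "F1", "F2", "F3"]
c = [
    [0, 1, 0, 0],   # X  -> F1
    [0, 0, 1, 0],   # F1 -> F2
    [0, 0, 0, 1],   # F2 -> F3
    [0, 1, 0, 0],   # F3 -> F1
]
mscs = find_mscs(nodes, c)
print(mscs)
```

Each group returned would then be collapsed into a pseudonode before the ordering of step (iii) is attempted.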


The connectivity matrix is arrived at by merging the connectivity
matrices of the controllability relations and observability relations.
All MSCs found in the directed graph are located and temporarily
identified as pseudonodes so that the ordering process of step (iii) can
be carried out. Ordering means that all nodes having only primary
inputs are considered to be of the highest order (i.e., first order), and a
node is of $i$th order if all of its inputs are of order $i - 1$ or less and at
least one of the inputs is of order $i - 1$. In an ordered list, nodes of the
$i$th order do not depend on those of $j$th order, for $j \geq i$, for controllability
and observability. The $k$th order is said to be higher than the
$i$th order if $k < i$. Once the ordering process is performed, we then
proceed to break the MSCs, if necessary.* The process of breaking
MSCs is similar to the one described by Ramamoorthy.⁷ First, the
entry nodes of a given MSC are identified. An entry node having the
highest ratio of number of incoming edges to number of outgoing edges
is selected. All edges entering it are deleted; this means that additional
control and/or monitor points are added to those nodes associated with
these edges.

* This is necessary only if the required resolvability is one faulty circuit pack and
there exist some MSCs that cannot be packaged on one circuit pack.
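
The ordering rule and the entry-node selection can be sketched as follows. Two readings are assumed here and may differ from the original procedure: the edge counts in the ratio include all edges of the graph, and "all edges entering it" is taken literally, so edges from outside the MSC are deleted as well. Function and node names are hypothetical.

```python
def order_nodes(nodes, edges):
    """Order 1 for nodes with no incoming edges; otherwise 1 + the largest
    order among the node's inputs. Raises ValueError if a loop remains."""
    preds = {n: [] for n in nodes}
    for src, dst in edges:
        preds[dst].append(src)
    order = {}

    def visit(n, path):
        if n in order:
            return order[n]
        if n in path:
            raise ValueError("graph is not loop-free")
        order[n] = 1 + max((visit(p, path | {n}) for p in preds[n]), default=0)
        return order[n]

    for n in nodes:
        visit(n, frozenset())
    return order


def break_msc(msc, edges):
    """Delete every edge entering the entry node with the highest
    incoming/outgoing ratio; return (chosen node, remaining edges)."""
    members = set(msc)
    entries = {d for s, d in edges if d in members and s not in members}

    def ratio(n):
        incoming = sum(1 for s, d in edges if d == n)
        outgoing = sum(1 for s, d in edges if s == n)
        return incoming / outgoing    # outgoing >= 1 for any node on a loop

    chosen = max(sorted(entries), key=ratio)
    return chosen, [(s, d) for s, d in edges if d != chosen]


nodes = ["X", "Y", "Z", "F1", "F2", "F3"]
edges = [("X", "F1"), ("Y", "F1"), ("Z", "F2"),
         ("F1", "F2"), ("F2", "F3"), ("F3", "F1")]
chosen, remaining = break_msc(["F1", "F2", "F3"], edges)
order = order_nodes(nodes, remaining)
print(chosen, order)
```

In this made-up graph F1 has three incoming edges and one outgoing edge, so it is selected ahead of F2, and the graph that remains is loop-free and can be ordered.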

The result of this process [steps (i) through (v)] is a loop-free
directed graph or a partially ordered list of nodes. Nodes of the $i$th
order are completely testable using only those nodes of orderings
higher than $i$. At this point, packaging considerations can be incorporated
in order to arrive at a reasonable set of partially ordered
lists of circuit packs. The general guidelines for packaging are:


(i) Group only nodes of the same ordering on one circuit pack or
set of circuit packs.

(ii) If this is not possible, then group a node (or a set of nodes) of
the $i$th order with those of order $i - 1$ or $i + 1$, but not both.
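
Read together, the two guidelines say that the nodes on one circuit pack should span at most two adjacent orders. A minimal check under that reading is sketched below; the node orders used are hypothetical values chosen for illustration.

```python
def pack_is_acceptable(pack, order):
    """True when the nodes on a pack span at most two adjacent orders."""
    levels = {order[node] for node in pack}
    return max(levels) - min(levels) <= 1


# Hypothetical orders: D and G first, A, B, C second, E and F third.
order = {"D": 1, "G": 1, "A": 2, "B": 2, "C": 2, "E": 3, "F": 3}

same_order = pack_is_acceptable(["A", "B", "C"], order)   # guideline (i)
adjacent = pack_is_acceptable(["G", "A"], order)          # guideline (ii)
spanning = pack_is_acceptable(["G", "B", "F"], order)     # orders 1, 2, and 3
print(same_order, adjacent, spanning)
```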


These guidelines assume that the resolution is to be to one circuit
pack. For example, the groupings of functional nodes shown in Figs.
6a and 6b are acceptable. In Fig. 6b, the ordering of circuit packs $P_i$
and $P_{i+1}$ is irrelevant because the two circuit packs are equivalent.
But the ordering of circuit packs $P_{i-1}$ and $P_i$ or $P_{i-1}$ and $P_{i+1}$ is
important. This is because functional node G controls both nodes A and
B, and therefore G must be of a higher ordering than nodes A or B.
Figure 6c illustrates a conflict because G controls A but A controls F.
Suppose functional nodes G, B, and F are packaged on one circuit pack.
A conflict then exists because packs $P_{i-1}$ and $P_i$ cannot be separated
for testing purposes. In other words, the resolution has been degraded
from one pack to two packs in this case.

Fig. 6—Packaging considerations. (Columns labeled $L_{i-1}$, $L_i$, and
$L_{i+1}$ contain functional nodes A through G.)

Once the nodes have been packaged and all the packaged circuits 
have been ordered, a partitioning procedure can be applied to the 
packaged circuit packs. The following example shows the process of 
ordering and partitioning of nodes. 


Example: Suppose the directed graph shown in Fig. 7a represents 
the controllability and the observability relations of a circuit. Each 
node in the graph represents a functional entity; each edge represents
either a controllability relation (denoted by $\Rightarrow$) between the
two nodes or an observability relation (denoted by $\rightarrow$) between the
two nodes. For example, node D observes node H; node E controls
node H. The information represented by this graph is equivalent to a
connectivity matrix which can be constructed by examining each func- 
tional node in the circuit and its observability and controllability 
relations with its neighboring functional nodes. 

To arrive at a partially ordered list of nodes, the first step is to locate 
all MSCs and represent them as pseudonodes. In this case, there is one 
MSC (as indicated by the dotted line in Fig. 7a) denoted by v. The 
reduced graph is then ordered by applying the ordering process. For 
example, to completely test and observe node C, which is of order L2, 
it is only necessary that nodes of higher order, i.e., nodes A and B of 
order L1, be available for control and observation. 

If all the nodes can be packaged at this point according to the 
previous guidelines, a partial ordering of nodes has been obtained. 
However, if, for example, the pseudonode v contains too many functions 
to be packaged, the pseudonode v must be further decomposed by 
breaking the MSC it represents. The entry nodes to this MSC are 
nodes D and G. Node G is chosen and the edge D-G is broken by adding 
control to node G. At this point there is still another MSC in the 
graph so the process is repeated (see Fig. 7b). 

A new MSC denoted by a pseudonode v1 is identified, and the graph 
is ordered once again. The MSC denoted by v1 is again “broken” after 
having decided to add an observable point to the functional node D. 
The final ordered list of functional nodes is shown in Fig. 7c; these 
nodes form a partially ordered list in that nodes of order L_i are 
testable using only those nodes of an ordering higher than L_i. 


Fig. 7—Partitioning and ordering of functional nodes. 
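
The two steps of the example, collapsing each MSC into a pseudonode and then leveling the reduced graph, can be sketched as follows. This is an illustration under stated assumptions rather than the analysis program itself: the function names are hypothetical, `graph[x]` is assumed to list the nodes that x controls or observes, and only the relation "A and B are needed for C" comes from the text. The D/G loop and the edge from C to D are invented for the demonstration.

```python
def reachable(graph, start):
    """Nodes reachable from start by one or more edges."""
    seen, stack = set(), [start]
    while stack:
        for nxt in graph.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def find_mscs(graph):
    """Map each node to its maximal strongly connected subgraph (pseudonode)."""
    nodes = set(graph) | {n for succ in graph.values() for n in succ}
    reach = {n: reachable(graph, n) for n in nodes}
    comp_of = {}
    for n in sorted(nodes):
        if n not in comp_of:
            comp = frozenset({n} | {m for m in reach[n] if n in reach[m]})
            for m in comp:
                comp_of[m] = comp
    return comp_of


def level_nodes(graph):
    """Level 1 needs no other node; level k needs only nodes of levels below k."""
    comp_of = find_mscs(graph)
    needs = {comp: set() for comp in comp_of.values()}
    for src, succ in graph.items():
        for dst in succ:
            if comp_of[src] != comp_of[dst]:
                needs[comp_of[dst]].add(comp_of[src])
    level = {}

    def order(comp):
        if comp not in level:
            level[comp] = 1 + max((order(c) for c in needs[comp]), default=0)
        return level[comp]

    return {node: order(comp) for node, comp in comp_of.items()}


# A, B -> C follows the text; the D/G loop and C -> D are made up.
graph = {"A": ["C"], "B": ["C"], "C": ["D"], "D": ["G"], "G": ["D"]}
levels = level_nodes(graph)
```

Nodes that share a level number with other members of their pseudonode (here D and G) are the ones that would need added control or observation, or joint packaging, before a finer ordering is possible.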

To demonstrate how fault(s) can be isolated using this process, the 
functional nodes are assumed to be packaged as shown in Fig. 7d. 
Without loss of generality, a binary partitioning process will be used 
in the following cases. 

Case 1: Single-fault location. Suppose a fault occurs on circuit pack 
P5. Diagnosis will be performed first on portions of circuitry consisting 
of P1, P2, P3, and P4, while circuit packs P5, P6, and P7 are disabled. 
The test results are valid because the testing of circuit packs P1 through 
P4 does not depend on circuitry associated with circuit packs P5 
through P7 due to the proper partial ordering requirement. The first 
test result (π1) will yield a pass indication implying that the fault is 
not on circuit packs P1, P2, P3, or P4. Following the process discussed 
in Section 2.1 (see Fig. 1), the diagnosis (π2) is then performed on 
circuit packs P5 and P6, using circuitry associated with P1 through P4, 
which has been previously verified, while disabling circuit pack P7. 
This time the tests will show a fail indication; the failure is isolated 
to circuit packs P5 and P6. In the next step (π3), circuit packs P6 and 
P7 are disabled while running tests on circuit pack P5 using circuitry 
associated with P1 through P4. The test result again shows a fail 
indication and, thus, identifies the faulty pack to be P5. 

Case 2: Multiple-fault location. Now suppose that a fault exists on 
circuit pack P3 and another fault exists on circuit pack P5. The initial 
diagnosis of partition π1 shows that a failure is in circuit packs P1 
through P4 with circuit packs P5 through P7 disabled. The failure 
symptom caused by the fault on P5 will not interact with the testing of 
P1 through P4 because P5 has been disabled. 

Next, circuit packs P1 and P2 are tested in partition π2; this test 
gives a pass indication. The final test (π3) is on P3 using circuits 
associated with P1 and P2, and disabling circuit packs P4 through P7. 
This test identifies the fault on P3. Once this faulty circuit pack is 
identified and replaced, a complete check is run. The presence of the 
fault on P5 will now indicate a test failure. The diagnostic process 
described previously is repeated to isolate and identify the second fault. 
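
The two cases can be mimicked with a short simulation. The sketch below is only an illustration: `run_tests` is a stand-in that fails whenever a faulty pack is among those under test, the seven pack names follow the P1 through P7 arrangement of the example, and the choice of the upper midpoint for each split is an assumption made so that the first partition covers four packs.

```python
def run_tests(under_test, faulty):
    """Stand-in for a diagnostic phase: passes only if no pack under test is faulty."""
    return not any(pack in faulty for pack in under_test)


def locate_first_fault(packs, faulty):
    """Binary partitioning over packs listed from highest to lowest order.

    Assumes a complete check has already failed, so at least one pack is bad.
    packs[:lo] are verified good; packs[hi:] stay disabled.
    """
    lo, hi = 0, len(packs)
    while hi - lo > 1:
        mid = (lo + hi + 1) // 2
        if run_tests(packs[lo:mid], faulty):
            lo = mid          # pass: everything up to mid is verified
        else:
            hi = mid          # fail: the fault lies in packs[lo:mid]
    return packs[lo]


def locate_all_faults(packs, faulty):
    """Isolate and replace faults one at a time until a complete check passes."""
    remaining, found = set(faulty), []
    while not run_tests(packs, remaining):
        bad = locate_first_fault(packs, remaining)
        found.append(bad)
        remaining.discard(bad)   # models replacing the pack with a good one
    return found


packs = ["P1", "P2", "P3", "P4", "P5", "P6", "P7"]
```

Calling `locate_all_faults(packs, {"P3", "P5"})` isolates the higher-order fault first and the second fault only after the first pack has been replaced, which is the behavior the ordering is meant to guarantee.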

It can be seen that this process enables us to systematically isolate 
faults one at a time until all faults in the circuits are identified and 
repaired. Any fault that gives a test result that is different (regardless 
of the nature of the fault) from the true-value signature is detectable. 
In other words, the single-fault assumption and the classical stuck-at-0 
and stuck-at-1 assumptions on failure modes are no longer necessary 
with this approach. 


2.2.4 Global Feedback 


The constraint of being able to fully disable the outputs of all circuit 
packs is an overly restrictive condition. The only reason for disabling 
is to prevent propagation of fault symptoms into the portion of the 
machine under test. Thus, at any partition only those leads crossing 
from the lower-order levels into the higher-order levels are of major 
importance. In a general case, this should represent only about half 
of the leads crossing the boundary. In a machine organized using 
COMET, it could be expected that the portion of leads crossing from 
low levels to high levels would be less than half, reflecting the attempt 
COMET makes to break things into a tree structure. 

For the purpose of discussion, “global feedback” will be defined on 
the linear ordering of functional nodes. A global feedback is any 
directed wire going from a lower-order node to a higher-order node in 
the list of partially ordered functional nodes shown in Fig. 8. The dis- 
tinction between a wire and a controllability/observability edge is 
important. Optimization of the latter (i.e., control by gate A) may re- 
move a connectivity matrix entry while the wire still remains. It can be 
seen that it is only necessary to be able to disable all leads that fit the 
definition of global feedbacks. This is a considerably less stringent 
requirement than being able to disable all circuit-pack outputs. 
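
Under this definition, the leads that must be disabled can be listed mechanically once every node has a level. The following is a small sketch with hypothetical names: `level[n]` is assumed to hold the level number of node n, with 1 as the highest order, and `wires` is assumed to be a list of (driver, receiver) pairs.

```python
def global_feedbacks(level, wires):
    """Return the wires that run from a lower-order node to a higher-order node."""
    return [(src, dst) for src, dst in wires if level[src] > level[dst]]


# Hypothetical three-level example: only the wire from C back to A qualifies.
level = {"A": 1, "B": 2, "C": 3}
wires = [("A", "B"), ("B", "C"), ("C", "A")]
feedback = global_feedbacks(level, wires)
```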

The concept of logic disabling is a method of emulating the physical 
removal of circuit packs (i.e., leaving circuit-pack-output gates in the 
all 1’s state). However, an output gate that is stuck-at-0 cannot be 
placed in this state when it is disabled, which presents a potential 
problem. This can be analyzed as follows. First, the stuck-at-0 output 
may feed to a higher-order node. If the ordering has been carefully 
observed, there must be an alternate method of controlling the lead. 
In this case the highest circuit pack fed by the stuck-at-0 output is 
identified as bad. If ordering has not been carefully observed and no 
special control has been added, the result is unpredictable. If the 
stuck-at-0 output feeds only lower nodes and control or observation 
has not been added, the highest of the lower nodes is identified as 
bad. If observation has been added, the fault should be properly 
isolated. In the cases where the ordering has been observed, the pack 
identified as bad is either the proper one or is fed by the stuck-at-0 
output. A simple check is to remove the circuit pack identified by 
diagnosis. Next, all packs feeding the removed circuit pack are 
disabled. At this point, all connector pins should be logical 1’s. A test 
connector can then be inserted into the vacant slot and examined. 
Assuming that the basic connectivity information is available on a 
per-pack basis, the actual faulty pack is now identified by correlating 
any grounded pins with a faulty pack. 


Fig. 8—Example of global feedback. 

The test procedure is highly automatic. Diagnosis proceeds to locate 
a suspect pack. The craftsman replaces this pack with the test con- 
nector. The machine disables the necessary packs, scans the connector, 
and locates any grounds. The location of the grounds can be combined 
with connectivity information stored on bulk storage to uniquely 
identify the bad circuit pack. This procedure is much less susceptible 
to manual errors than previous diagnostic techniques. 


III. FEASIBILITY STUDY—APPLICATION OF COMET TO A SMALL PROCESSOR 


To verify the feasibility of the COMET procedure, a small 
self-checking processor was selected for study. This choice was made 
because a simulation model of the processor existed, which allows the 
ability of COMET to locate faults to be verified by simulation. In 
addition, the processor is complex enough to present a good sampling 
of “real-life” problems. 


3.1 Brief description of processor 


The processor is a stored program machine,’ composed of approxi- 
mately 4,400 logic gates. It features a microprogram control and a 
general-register structure. It is fully self-checking and does not rely 
on matching for fault detection. From a system view, the active and 
standby processors are linked to the outside world by the local 
maintenance center (LMC).* The LMC is responsible for the diagnosis of 
an off-line central control. In terms of COMET, control of the disabling 
and actual testing is exercised by the LMC. 

The interface between the processor and the LMC is detailed in 
Fig. 9. The major ports for controlling the processor are the input bus, 
the microprogram data register (MPDR), the clock, and the various 
stores. The ports prescribe external controllability and observability 
for the processor. The internal structure of the processor is a 
register-bus structure with 16 addressable registers. 


Fig. 9—Interface between processor and local-maintenance center. 

The control of the machine centers around a microprogram unit. 
This unit is composed of a 36-bit data register and two major 
decoders, the “to” and “from” decoders. It is highly self-checking as is 
the entire machine. Nearly any hardware error causes immediate 
notification of the LMC via the status register.® 


3.2 Formation of the connectivity matrix 


The first step in deriving the processor connectivity matrix is to 
define a reasonable set of functional nodes. In this case, the functional 
node definitions parallel the processor block diagram quite closely. 
There are a total of 41 functional nodes under consideration. These 
vary considerably in size. The LMC, for instance, is nearly as large as 
an entire processor while the decision logic is only a few gates. 

With a preliminary set of functional nodes defined, the control- 
lability and observability relations can be derived. These relations 
are derived on a local (i.e., node-by-node) basis. In other words, 
the set of relationships for a single node can be written by only 
considering its neighboring nodes. In general, these relationships bear 
a close association to the physical connections in a circuit. In fact, in 
the limit, every physical connection between two nodes could generate 
both a control and an observation edge. 

The mechanical construction of the connectivity matrix from 
physical connection information alone will yield a sufficient set of 
conditions for leveling. This is equivalent to doing a bit-by-bit OR of 
the control connectivity matrix with its transpose to formulate the 
combined control and observation matrix. However, in practice, 
simplification can often be achieved without resorting to this seemingly 
brute-force approach. 
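
The brute-force construction amounts to a one-line matrix operation. The sketch below assumes a square 0/1 matrix held as a list of lists; the function name and the three-node chain used as input are illustrative only.

```python
def or_with_transpose(control):
    """Bit-by-bit OR of a square 0/1 matrix with its transpose."""
    n = len(control)
    return [[control[i][j] | control[j][i] for j in range(n)]
            for i in range(n)]


# Hypothetical chain: node 0 drives node 1, node 1 drives node 2.
control = [[0, 1, 0],
           [0, 0, 1],
           [0, 0, 0]]
combined = or_with_transpose(control)
```

Every physical connection now appears in both directions, which is why this form is sufficient for leveling but usually contains more entries than are needed.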

The method of simplification is based on two logic features not 
properly represented by the graph-theory model. The first and most 
important feature is the existence of fanout. A gate from a register 
may fan out to several points, as shown in Fig. 10a. The resultant 
directed graph for the controllability and observability relations, 
derived strictly from physical connection information, is shown in 
Fig. 10b. The interpretation is that all of nodes B, C, and D are 
required to observe node A. In fact, however, it is only necessary 
to have node B or C or D to observe node A. Rather than have entries 
in the observability matrix for all three, only one entry is needed. The 
determination of which entry to retain is usually not completely 
arbitrary. In general, it is desirable to retain only the highest-order 
node (B, C, or D) for observation. 


Fig. 10—Fanout considerations in modeling. 

Fig. 11—Collector-tie considerations. 

The other logical feature that allows simplification is the collector 
tie (see Fig. 11a). Its effect on the control-connectivity matrix is 
analogous to the effect of fanout on the observation-connectivity 
matrix. It is only necessary to be able to control one of the inputs to a 
collector tie to check its validity with respect to the gates feeding it 
(Fig. 11b).* 
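
Both simplifications reduce to keeping one entry out of several equivalent candidates. A minimal sketch follows, assuming that tentative level numbers are already available from an earlier global pass (1 being the highest order). The names `prune`, `observers`, and `level`, and the level values themselves, are hypothetical.

```python
def prune(candidates, level):
    """For each node keep the single candidate with the smallest level number."""
    return {node: min(options, key=lambda n: level[n])
            for node, options in candidates.items() if options}


level = {"B": 2, "C": 1, "D": 3}       # hypothetical tentative levels
observers = {"A": ["B", "C", "D"]}     # fanout: any one of these can observe A
kept = prune(observers, level)
```

The same function could be applied to the controlling inputs of a collector tie, since only one controllable input is required there as well.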


3.3 Ordering and partitioning of nodes 


The 41 nodes may now be analyzed to arrive at a partial ordering. 
If they can be leveled at this point, an order is established. In general, 
this will not be the case. Analysis programs are then used to identify 
and locate controllability and observability MSCs. There are several 
ways to treat these problems. From a diagnostic point of view, the 
MSCs should be broken down into link-graph structures. This implies 
adding control or observation points and reanalyzing the graph. This 
approach will yield a circuit that is diagnosable to one functional node 
(which may be one circuit pack or part of a circuit pack). 


*The resolution of a ground on a collector-tied node is a well-known 
classical problem. 

In practice, adding sufficient control or observation to break all 
MSCs may be expensive. In many cases the acceptance of reduced 
resolution is more attractive than the addition of much hardware. For 
example, it may not be economical to resolve a circuit pack output 
stuck-at-1 fault and an input-diode-open fault of the pack it drives, 
even if these two packs form a two-node MSC. Whenever possible one 
should always attempt to put the connected nodes on a single package 
to reduce (and eliminate) MSCs. This is equivalent to admitting that 
faults in the nodes are indistinguishable. However, this is of no im- 
portance if they are detectable and on a single package. 

In attempting to partially order the processor nodes, several con- 
trollability and observability MSCs were discovered. The most obvious 
concern is the status register. The states of all check circuits in the 
machine are sampled and trapped in the status register. Any error 
indications cause an immediate maintenance interrupt of the processor. 
The LMC has the ability to directly read all bits of this register and 
to clear the register. However, it did not originally have controlled 
write access to the register. The situation that existed is shown in Fig. 
12. This can be written as follows: 


STATUS REGISTER -> “TO” DECODER 1/N CHECK -> “TO” 
DECODER => STATUS REGISTER 


and indicates the presence of a loop. The flip-flops of the status 
register are not testable in a “start-small” diagnosis. The solution to 
this problem is rather simple. The LMC is given controlled write 
access to the status register, as shown in Fig. 12. This allows removal 
of the control link specifying: 


“TO” DECODER => STATUS REGISTER 

and replacing it with: 

LMC => STATUS REGISTER. 
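
The effect of the substitution can be checked with a simple loop test on the three relations involved. In this sketch control and observation edges are treated alike as dependencies, the node labels are abbreviated strings chosen for the example, and `has_cycle` is a hypothetical helper, not part of the analysis programs.

```python
def has_cycle(edges):
    """Depth-first search for a directed loop in a list of (src, dst) edges."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)
    WHITE, GREY, BLACK = 0, 1, 2
    color = {}

    def visit(node):
        color[node] = GREY
        for nxt in graph.get(node, ()):
            state = color.get(nxt, WHITE)
            if state == GREY or (state == WHITE and visit(nxt)):
                return True
        color[node] = BLACK
        return False

    return any(color.get(n, WHITE) == WHITE and visit(n) for n in list(graph))


before = [("STATUS REGISTER", "TO DECODER 1/N CHECK"),
          ("TO DECODER 1/N CHECK", "TO DECODER"),
          ("TO DECODER", "STATUS REGISTER")]
# Controlled write access from the LMC replaces the decoder's control link.
after = before[:2] + [("LMC", "STATUS REGISTER")]
```

With the original links `has_cycle(before)` reports a loop; with the LMC link substituted the three nodes can be leveled.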


In addition, there is a two-node loop between the MPDR and the 
microprogram store (MPS). This is solved by careful merging of the 
LMC access to the MPDR. The collector-tie access will restrict fault 
isolation to two packs (one in the LMC and one in the MPS) in the 
stuck-at-0 case but will allow resolution of the stuck-at-1 problems 
to one pack. The decision logic and MPDR are also connected by a 
loop. The decision logic controls bit 0 of the MPDR and is observed 
by the MPDR. The stuck-at-0 on the decision logic output would only 
be resolvable to two packs. The stuck-at-1 output is not distinguishable 
from the set-bit-0 input open. The provision of a single control-write 
lead from the LMC eliminates the latter resolution problem. 


Fig. 12—Status register and its local-maintenance-center interface. 

Another loop exists between the decoders and the 1/N check 
circuits. Again, either two-pack resolution must be accepted or the 
two nodes must be packaged together if additional hardware is not 
provided. In this case, the decoder functions can be halved into an 
even parity and an odd parity and part of the check circuitry packaged 
with each half. The tradeoff again is one of engineering judgment and 
cost effectiveness. 

The process of ordering the processor nodes can be completed by 
consciously resolving any conflict that appears. The term 
“consciously” is a key point. Assuming that the control and observation 
relations are complete, the analysis procedure will point out diagnostic 
problems. To proceed to a fully ordered graph, these problems must be 
examined and treated. Thus, the existence of a fully ordered graph 
should insure consideration of diagnostic problems. A by-product of 
the ordering process will be the addition of maintenance hardware or 
a generated list of graph edges having some degree-of-resolution 
problems. The nature of this list of edges is such that it should give 
a reasonably good measure of the diagnostic resolution possible with 
the proposed design. 

The final ordering of the functional nodes (Fig. 13) requires seven 
levels to encompass the 41 functional nodes. In this case there are 
fewer circuit packs than nodes.’ This indicates that more than one 
node will be packaged on a circuit pack, as expected. The diagram 
gives some guidelines as to which nodes can conveniently be packaged 
together and which should not be packaged together. 

From the ordered control/observation graph, a diagnosis strategy 
can be derived. Here a linear-partitioning process is used. In a 
linear-partitioning process, all circuit packs but one of the highest 
order are first disabled. Tests are run on this circuit pack and, if a 
failure is detected, the fault is in the circuit pack. If the failure is not 
detected, further partitioning is in order. In a second partition, circuit 
packs 1 and 2 (of order 1 and 2) are enabled, whereas the rest of the 
circuit packs are disabled. Tests are then run on circuit packs 1 and 2. 
If they pass the tests, then the fault(s) would be in the rest of the 
disabled portion; if they fail the tests, then the faulty circuit pack is in 
the portion of circuit under test. Since circuit pack 1 was verified to be 
good in the previous partition, the faulty circuit would be circuit 
pack 2. This process is repeated for each partition in a linear fashion 
until the faulty pack is located. 


Fig. 13—Partial ordering of processor nodes. 
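
The linear-partitioning process can be sketched in a few lines. This is an illustrative model only: packs are assumed to be listed from highest to lowest order, a partition is assumed to fail exactly when an enabled pack is faulty, and the function name is hypothetical.

```python
def linear_locate(packs, faulty):
    """Enable one more pack per partition; return the first bad pack, or None."""
    for k in range(1, len(packs) + 1):
        enabled = packs[:k]                       # packs[k:] remain disabled
        if any(p in faulty for p in enabled):     # tests on this partition fail
            return packs[k - 1]                   # earlier packs were verified
    return None


packs = ["1", "2", "3", "4"]
```

Compared with binary partitioning, this takes up to one partition per pack, but each partition adds only a single unverified pack to the circuitry under test.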

In this example, the diagnostic strategy is to verify the first five 
levels in order. The power (L1) is verified from the LMC. Next, the 
status register (L2) is verified by controlled write and read access. 
With the status register now available as an observation port, the 
clock (L3) can be verified. Next, controlled read and write access are 
used to verify the MPDR and the decoder check circuits (L4). The 
MPDR and the status register provide sufficient control and observation 
to diagnose the MPDR check circuits (L5). Cycling through the MPS and 
observing the results with the MPDR will verify the microstore (L5). 
The decoders (L5) are now checked using the MPDR and 
clock as inputs and the status register as outputs. Note that the 
combination of the previously discussed packaging reasons could 
obviously modify Ls; and Le. The buses (Ls) are verified by controlled 
read and write access. This leaves a skeleton processor capable of 
executing microinstructions. The nodes of L6 and L7 must be arbitrarily 
converted into a form adaptable to linear enabling and diagnosis; 
an example is shown in Fig. 14. It was largely these functional 
nodes that were used for the simulation experiments. 


3.4 Fault location using COMET 


To verify the results that are expected from COMET, several 
simulation experiments were performed. A 2700-gate* simulation 
model of the processor was used. The model, which excluded the 
various stores, the peripheral communication and the LMC interfaces, 
provides a good vehicle for checking the COMET technique. 

The procedure for checking the control section of the machine is 
quite straightforward with the aid of the LMC. Thus, simulation was 
performed assuming that the microcontrol section had been tested and 
found good. 


*The major difference in gate count between the simulation model (2700 gates) 
and the entire processor (4400 gates) is due to the inclusion of only two of the 
general-purpose registers in the simulation model. 


Fig. 14—Functional nodes included in simulation experiments. 


Selected faults were inserted into the model and a simplified set of 
diagnostic tests was run. The functional node arrangement and the 
test (π_i) procedure are shown in Fig. 14, in which a binary-partitioning 
technique is used. In actual applications, this would be replaced with 
a linear-partitioning arrangement. 

The first experiment consisted of inserting a stuck-at-0 fault in 
bit 10 of the first level of the combinational data-rotation network.’ 
The sequence of tests and their results, as derived by simulation, are 
shown in Table I. Tests π1 and π10 passed, whereas π2 and π5 failed. 
The pass/fail number of π1 π2 π5 π10 = (1001) uniquely points to a 
functional node which, in this case, is the rotation or “steer.” 
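
The pass/fail number is simply one bit per test phase, 1 for a pass and 0 for a fail. The sketch below forms it from good-machine and faulty-machine output vectors and looks the result up in a small dictionary. The phase names `pi1`, `pi2`, and so on, the data layout, and the function name are assumptions. The vectors are one row per phase from Table I, and the two dictionary entries are the pass/fail numbers quoted in this section.

```python
def signature(observed, phases):
    """One bit per phase: '1' if every faulty output equals the good output."""
    return "".join(
        "1" if all(good == bad for good, bad in observed[p]) else "0"
        for p in phases
    )


# (good, faulty) output vectors, one input vector per phase, from Table I.
OBSERVED = {
    "pi1":  [("0000 0000 0000 0000 00", "0000 0000 0000 0000 00")],
    "pi2":  [("0000 0000 0000 0000 00", "0000 0000 0010 0000 00")],
    "pi5":  [("0000 0000 0000 0000 00", "0000 0000 0010 0000 00")],
    "pi10": [("0000 0000 0000 0000 00", "0000 0000 0000 0000 00")],
}

STEER_PATH = ("pi1", "pi2", "pi5", "pi10")
IAR_PATH = ("pi1", "pi3", "pi6", "pi11")

# Pass/fail numbers quoted in the text for the two single-fault experiments.
DICTIONARY = {
    (STEER_PATH, "1001"): "steer",
    (IAR_PATH, "0101"): "IAR",
}

node = DICTIONARY[(STEER_PATH, signature(OBSERVED, STEER_PATH))]
```

Because the lookup depends only on which phases pass and which fail, it does not matter what kind of fault produced the differing output, only that the test set detects it.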

The second experiment was to insert a stuck-at-0 fault in the 
instruction address register (IAR). The results of running the test 
phases on this fault are shown in Table II. Again, a unique functional 
node is isolated by the pass/fail number π1 π3 π6 π11 = (0101). 

For a third experiment, both the fault in the rotation logic and the 
fault in the IAR were inserted simultaneously. As expected, the IAR 
(i.e., the highest) fault was isolated independent of the other fault. 
Upon correction of the IAR fault, the steer fault would be isolated. If 
the replacement circuit pack that is to correct the IAR fault had been 
faulty itself, it would also have been isolated. This demonstrates the 
capability of isolating multiple faults. 


Table I—Simulation results—steer network fault 

Machine   Test            Input    Output vector 
          (see Fig. 14)   vector   (binary) 

Good      π1              #7       0000 0000 0000 0000 00 
Faulty    π1              #7       0000 0000 0000 0000 00 
Good      π1              #12      1111 1111 1111 1111 11 
Faulty    π1              #12      1111 1111 1111 1111 11 
Good      π1              #14      0000 0000 0001 0000 11 
Faulty    π1              #14      0000 0000 0001 0000 11 
Good      π1              #16      1000 0000 0001 0000 10 
Faulty    π1              #16      1000 0000 0001 0000 10 
Good      π2              #8       0000 0000 [0000] 0000 00 
Faulty    π2              #8       0000 0000 [0010] 0000 00 
Good      π2              #13      1111 1111 1111 1111 11 
Faulty    π2              #13      1111 1111 1111 1111 11 
Good      π5              #6       0000 0000 [0000] 0000 00 
Faulty    π5              #6       0000 0000 [0010] 0000 00 
Good      π5              #9       1111 1111 1111 1111 11 
Faulty    π5              #9       1111 1111 1111 1111 11 
Good      π10             #6       0000 0000 0000 0000 00 
Faulty    π10             #6       0000 0000 0000 0000 00 
Good      π10             #9       1111 1111 1111 1111 11 
Faulty    π10             #9       1111 1111 1111 1111 11 

The final simulation experiment involved altering the basic IAR 
circuit to reflect a signal short between bits 4 and 14 of the register. 
The test set used has sufficient fault-detection capability to recognize 
this nonclassical fault. The COMET approach makes it possible to 
isolate the fault, as indicated by the simulation results shown in 
Table III. 

These experiments bear out the results expected for isolating single, 
multiple, and/or nonclassical faults. They also point out the dependence 
on a good set of tests. The signal short is an example of this dependence. 
Normally, a set of tests capable of detecting all stuck-at-1 or stuck-at-0 
faults might not detect the short. However, if the fault is detectable, 
it is isolated by using COMET. Considerations such as these will 
obviously have some impact on the diagnostic program design. 


IV. TRADE-OFFS 


Most of the discussion thus far has considered diagnostic resolution 
to a single circuit pack. If COMET dictated an all-or-nothing approach 
to the problem, its usefulness would be severely affected. The cost of 
removing all control/observation MSCs is in general quite high. There 
are a number of alternatives that can and should be evaluated. 


Table II—Simulation results—instruction address register 
classical fault 

Machine   Test            Input    Output vector 
          (see Fig. 14)   vector   (binary) 

Good      π1              #7       0000 [0000] 0000 0000 00 
Faulty    π1              #7       0000 [1000] 0000 0000 00 
Good      π1              #12      1111        1111 1111 11 
Faulty    π1              #12      1111        1111 1111 11 
Good      π1              #14      0000        0001 0000 11 
Faulty    π1              #14      0000        0001 0000 11 
Good      π1              #16      1000        0001 0000 10 
Faulty    π1              #16      1000        0001 0000 10 
Good      π3              #4       0000 0000 0000 0000 00 
Faulty    π3              #4       0000 0000 0000 0000 00 
Good      π3              #5       1111 1111 1111 1111 11 
Faulty    π3              #5       1111 1111 1111 1111 11 
Good      π3              #7       0000 0000 0001 0000 11 
Faulty    π3              #7       0000 0000 0001 0000 11 
Good      π3              #8       1000 0000 0001 0000 10 
Faulty    π3              #8       1000 0000 0001 0000 10 
Good      π6              #6       0000 [0000] 0000 0000 00 
Faulty    π6              #6       0000 [1000] 0000 0000 00 
Good      π6              #10      1111 1111 1111 1111 11 
Faulty    π6              #10      1111 1111 1111 1111 11 
Good      π11             #5       0000 0000 0000 0000 00 
Faulty    π11             #5       0000 0000 0000 0000 00 
Good      π11             #8       1111 1111 1111 1111 11 
Faulty    π11             #8       1111 1111 1111 1111 11 

First, packaging must be considered as a trade-off parameter. It is 
possible that toleration of a slight reduction in packaging density 
could result in a significant reduction in the number of MSCs left to 
be broken. Procedures could be derived to evaluate the possible packag- 
ing trade-offs, but for now it appears to be a matter of engineering 
judgment. 

Table III—Simulation results—instruction address register 
nonclassical fault 

Machine   Test            Input    Output vector 
          (see Fig. 14)   vector   (binary) 

Good      π1              #7       0000 [0000] 1111 1111 11 
Faulty    π1              #7       0000 [1000] 1111 1111 11 
Good      π1              #12      1111 1111 0000 [0000] 11 
Faulty    π1              #12      1111 1111 0000 [0010] 11 
Good      π1              #15      0000 0000 0001 0000 11 
Faulty    π1              #15      0000 0000 0001 0000 11 
Good      π1              #19      1000 0000 0001 0000 10 
Faulty    π1              #19      1000 0000 0001 0000 10 
Good      π3              #4       0000 0000 1111 1111 11 
Faulty    π3              #4       0000 0000 1111 1111 11 
Good      π3              #5       1111 1111 0000 0000 11 
Faulty    π3              #5       1111 1111 0000 0000 11 
Good      π3              #8       0000 0000 0001 0000 11 
Faulty    π3              #8       0000 0000 0001 0000 11 
Good      π3              #10      1000 0000 0001 0000 10 
Faulty    π3              #10      1000 0000 0001 0000 10 
Good      π6              #6       0000 [0000] 1111 1111 11 
Faulty    π6              #6       0000 [0000] 1111 1111 11 
Good      π6              #10      1111 1111 0000 [0000] 11 
Faulty    π6              #10      1111 1111 0000 [0000] 11 
Good      π11             #5       0000 0000 1111 1111 11 
Faulty    π11             #5       0000 1000 1111 1111 11 
Good      π11             #8       1111 1111 0000 0000 11 
Faulty    π11             #8       1111 1111 0000 0010 11 


Second, the possibility of accepting reduced resolution must also be 
considered. A vast majority of the loops in the controllability and 
observability connectivity matrix contain leads going from one pack 
to another. An example of those leads is shown in Fig. 15a. The wire 
marked A will generate a control relation and an observation relation, 
as shown in Fig. 15b. COMET analysis of this situation will reveal 
that to distinguish between an output stuck-at-1 fault (on gate X) 
and an input-diode-open fault (on gate Y), one must add control or 
observation to line A. However, if two-pack resolution is tolerable, 
one can ignore this two-node loop generated by wire A. Experience 
has shown that a very large number of the problems that 
graph-theoretic analysis points out are of this type. It is reasonably 
simple to handle these problems if reduced resolution is tolerable. When 
COMET analysis points out one of these loops, the engineer can remove 
either the control or observation entry in the connectivity matrix and 
place it in a reduced-resolution list. Later, after leveling the graph, 
he will go back and determine a set of suspect packs to be identified 
when a pack is found to be bad. Thus, the most probably faulty pack 
is named along with packs that could have stuck-at-1 outputs. The 
information to do this is available in the previously created 
reduced-resolution list. 


Fig. 15—Resolution vs cost trade-off. 
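
One way the reduced-resolution list might be consulted after diagnosis is sketched below. The list format, (driving pack, driven pack) pairs, and the function name are assumptions made for this example. The idea taken from the text is only that the pack named by diagnosis is reported together with any packs whose stuck-at-1 outputs could produce the same symptom.

```python
def suspect_packs(bad_pack, reduced_resolution):
    """Name the diagnosed pack first, then the packs driving it through set-aside leads."""
    extra = sorted({drv for drv, rcv in reduced_resolution
                    if rcv == bad_pack and drv != bad_pack})
    return [bad_pack] + extra


# Hypothetical entries set aside while leveling the graph.
reduced_resolution = [("P2", "P4"), ("P3", "P5"), ("P1", "P4")]
suspects = suspect_packs("P4", reduced_resolution)
```

The diagnosed pack stays at the head of the list because it remains the most probable fault; the additional packs are alternatives to try only if replacing it does not clear the failure.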

It is likely that “hybrid” techniques are also possible. A small 
number of faults that would be costly to handle by COMET could be 
fault simulated. The simulation of this small set of faults is much less 
expensive than the totally fault simulated case. This approach, 
however, is open to the same criticism as the traditional exact-match 
TLM. Consistent results may be hard to come by and one also must 
update the fault simulation as any changes are made. 

COMET is obviously of most value when applied completely. 
Engineering considerations may point to this being undesirable. In 
this case, a number of trade-offs and less costly variations on COMET 
can be used. 


V. IMPACT OF COMET 
5.1 Design for maintainability 


The major impact of COMET should be a partial redefinition of de- 
sign philosophies for digital systems. COMET is, after all, an engineer- 
ing technique. It is capable of identifying diagnostic resolution problems 
before the design has been frozen and allows the design engineer to 
consciously evaluate the possible problems. To proceed in the analysis, 
the engineer must determine what is to be done about the problem. 
His decision may lead to the compilation of a table of resolution 
problems. This list can give an idea of the overall fault resolution 
possible for a design. Iteration of the process will allow full evaluation 
of the cost/resolution trade-off before the design is committed to 
hardware. With this tool, diagnosability takes its place as a primary 
design parameter with a reasonably well-defined set of trade-offs for 
evaluation. 

Efficient application of COMET depends in part on the orderly 
application of the technique. Analysis should first consider rather 
global functional blocks. The information gained at this stage of 
analysis provides the basis for optimization of the control and 
observation relations. The global analysis will usually determine which 
of the observation edges to retain. It can also help prescribe necessary 
external control and observation of the system. In many cases, the 
ability to provide only necessary external access can lead to 
minimization of the interaction of systems and associated 
sanity-preservation problems. 

Once the global analysis phase has been completed, the expansion of 
the global functional nodes on a local (node-by-node) basis can take 
place. The task is to expand the global functional node into identifiable 
entities. COMET is then used within the global node to specify an 
organization that is diagnosable. The problem has been converted to 
one of modularization. By carefully considering the amount of con- 
nectivity of nodes, packaging problems can also be anticipated. 

One outgrowth of the proposed design approach should be an 
increased awareness of the functional aspects of a system. This, in 
turn, should lead to more attempts at functional testing and an in- 
creased ability to do localized test design. COMET requires conscious 
consideration of test design and conscious consideration of the con- 
trollability and observability of an entity. It further requires that 
diagnosis be ordered such that these input/output ports are verified 
prior to testing the entity in question. This information may be 
sufficient to allow test design and verification on a local basis. This 
would lend itself to algorithmic test generation and inexpensive 
verification. The ordering process, if adhered to, could also sub- 
stantially reduce the problem of having to simulate faults in large 
blocks of circuitry for test-design purposes. 

Probably a more important aspect of the functional orientation of 
diagnosis is the emphasis on detection. Traditional diagnostic design 
relies on detection of all single stuck-at types of faults. In integrated- 
circuit technology, this may be a somewhat restricted subset of all 
possible failures. The nonclassical failures may cause trouble in 
traditional diagnosis, even though they are nearly always detectable. 
With COMET, detection is the only thing that is important. 


5.2 Other applications of COMET 


COMET has a number of advantages over normal diagnostic 
procedures. In the simulation experiments, the ability to diagnose 
multiple independent faults was touched upon briefly. This ability 
means that a machine designed using COMET can be tested upon 
installation with the regular diagnostic program. The procedure would 
be to insert a selected number of the highest-order nodes and run the 
diagnostic. The missing packs appear to be disabled. The initial packs 
can be diagnosed and, if no faults are found, the process can be con-
tinued. Otherwise, the faulty pack or packs are replaced and the
process repeated. The ability to isolate nonclassical faults was also 
touched upon briefly. The impact of this on the diagnostic program 
must be evaluated. 
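
The staged installation procedure described above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' program: the names `staged_install`, `diagnose`, and `replace`, the batch size, and the pack names are hypothetical, and the regular diagnostic program is reduced to a callback that reports which of the installed packs fail.

```python
from typing import Callable, List, Set


def staged_install(packs_in_order: List[str],
                   batch_size: int,
                   diagnose: Callable[[Set[str]], Set[str]],
                   replace: Callable[[str], None]) -> List[str]:
    """Insert packs batch by batch, running the diagnostic after each batch.

    packs_in_order is assumed to list the highest-order nodes first.
    diagnose(installed) returns the installed packs reported faulty.
    Returns the packs that had to be replaced, in order of replacement.
    """
    installed: Set[str] = set()
    replaced: List[str] = []
    for start in range(0, len(packs_in_order), batch_size):
        # Packs not yet inserted are treated as disabled.
        installed.update(packs_in_order[start:start + batch_size])
        faulty = diagnose(installed)
        while faulty:
            for pack in sorted(faulty):
                replace(pack)
                replaced.append(pack)
            # Repeat the diagnostic after replacement.
            faulty = diagnose(installed)
    return replaced


# Toy stand-in for a machine with one bad pack (hypothetical data).
bad_packs = {"P3"}
result = staged_install(["P1", "P2", "P3", "P4"], 2,
                        lambda inst: inst & bad_packs,
                        bad_packs.discard)
print(result)
```

In this toy run the first batch passes, the second batch exposes the bad pack, and the process continues once that pack has been replaced.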

It should be noted that COMET can be used at various levels of 
design and analysis. For instance, the first analysis may assume func- 
tional nodes to be of the subsystem size. At this level of detail, the 
overall system diagnostic philosophy may be evaluated and improved. 
Next, each functional node of subsystem size may be subdivided into 
functional nodes of circuit-pack size. Assuming that the circuit packs 
may eventually be composed of LSI chips on a ceramic, it may be 
feasible to consider subdividing the circuit-pack-sized functional nodes 
into LSI-chip-sized functional nodes. The potential benefit here is in 
the area of factory repair. If COMET is applied at the chip level it 
becomes possible to isolate the faulty LSI chip using factory test 
apparatus. Faulty chip location is done by automatically disabling 
chip outputs and testing the remaining chips. Conceptually, this could 
simplify the problems of repair of LSI packages. 


5.3 Further work 


There are several aspects of COMET that will require further work. 
Most apparent is work intended to automate a large part of the 
analysis and synthesis phases. This work appears to be of an open- 
ended type. A system that iteratively seeks a near optimum solution 
while considering physical design and other constraints would need to 
be a sophisticated system. The system of program aids that has been 
implemented as part of LAMP represents a start in this direction. 

In addition, some thought must also be directed to obtaining the 
functional node definitions. These do not of necessity have to parallel 
the function performed, but can reflect such techniques as bit slicing 
and other packaging considerations. 

The idea of relying on routine exercise procedures to check failures 
in logic disabling is not very appealing. It is possible that improvements 
can be made in this area. 


VI. SUMMARY 


The concept of controllability and observability for organized 
system design to enhance diagnosability has been introduced. It makes 
use of logic disabling of circuit modules (or packs) and the proper 
ordering of functional nodes of a system. Both binary and linear 
ordering can be used; other arrangements are possible but have not
been investigated. Graph-theoretic algorithms for determining the
optimum placement of control, access, and monitor points for
diagnostic testing were presented. 

A feasibility study was performed by applying COMET to an 
existing processor. The capability of locating single, multiple, and non- 
classical faults was demonstrated, based solely on the pass/fail indi- 
cation of test partitions. Additional control and monitor circuitry was 
added to satisfy the requirements of COMET. The added hardware, 
which included the logic both for disabling and for modifications, is
modest (e.g., less than 10 percent).

For a given set of tests, significant savings in simulation time to 
generate TLM data are achievable for a design using COMET. 
Furthermore, if a laboratory or a field model of the system is available, 
the TLM data generation effort will be almost totally eliminated. 
Data can be generated by actually running the diagnostic program on 
the machine for which a TLM is to be produced. This approach also 
makes TLM update easy and therefore drastically reduces many 
problems caused by machine differences and circuit changes. Finally, 
it offers the possibility to tailor-make TLMs that can be stored 
on-line. 

Application of COMET is best suited for new designs where the 
trade-offs among circuit design, logic packaging, and diagnosability 
can be jointly considered early in the design stage. The use of machine 
aid tools would make this feasible and could greatly facilitate the 
design process. Retrofitting COMET to existing systems may be feasible
on many designs, but may not be economical on others. Engineering 
judgment must be exercised to study the impact. 

Specific attempts have been made at Bell Laboratories to shorten develop-
ment intervals, to improve the quality of system design, and to improve
unit, manufacturing, and system testing by widespread application of the 
LAMP system during development of the 1A Processor and No. 4 Elec- 
tronic Switching System. LAMP has played a major role in two areas of 
the design process—design verification and fault simulation. Although ten 
major digital subsystems of No. 4 ESS and the 1A Processor are now being 
simulated on the LAMP system, this paper describes the experience gained 
during development of two of the 1A Processor subsystems, central control 
and program/call store. 


I. INTRODUCTION


Computerized development aids have become an integral part of the 
development of large, complex, electronic systems. One such aid, the 
LAMP system, is finding widespread application in the development 
of the 1A Processor,¹ a stored program processor, and in No. 4 ESS,²
a new switching system that employs a solid-state, time-division, 
digital-switching network. 

The need for computerized aids in the development of advanced 
switching systems is vital for several reasons. Vast amounts of engi- 
neering and manufacturing information must be generated. Compli- 
cated design decisions coming from engineers of diverse disciplines 
must be coordinated, and with the increasing complexity of systems 
and electronic technology comes the need for more thorough and con- 
sistent testing at all stages of design and manufacture. As the com- 
puter increases in power, it plays a greater role in reducing manual 
design effort, enhancing design quality, improving the accuracy of 
information transfer, and making more complex designs economically 
feasible. 

For these reasons, the LAMP system has been used on TSS 360/67 
to simulate ten major subsystems in No. 4 ESS and the 1A Processor. 
The experience gained during the development of two of these sub- 
systems, central control (the heart of the 1A Processor, which provides 
program execution and overall executive control) and program/call 
store (a magnetic core memory unit that provides storage for ESS 
programs and temporary call-related data) is detailed here. 

To facilitate maintenance of design information and provision of 
adequate testing for the project, computerized data bases have been 
implemented for all design information. The combination of common 
data bases for hardware and software design information, the LAMP 
simulator, and the conversational features of an interactive host com-
puter has proven quite effective in the hardware and test design for
the project. 

There are two major ways in which the LAMP system has been used 
in the development process: design verification and fault simulation. 
These will be discussed separately and in detail in Sections II and III,
respectively, of this paper. Briefly, design verification consists of 
demonstrating that the unit being simulated performs the functions it 
was designed to perform, with no faults present. Fault simulation, on 
the other hand, consists of inserting faults into the simulated unit and 
testing the ability of maintenance programs to detect and isolate 
those faults. 

Three categories of test programs are used in the simulation of the 
previously mentioned units. They are: 


(i) Circuit pack level tests, which are used in design verification,
fault simulation, and pack testing.

(ii) Diagnostic tests, which are the primary tool for both design
verification and fault simulation at the complete unit level.
The diagnostic tests are written in a high-level language, con-
current with the design of the hardware. They are intended for
factory and installation tests as well as for on-line fault detec-
tion and repair in an operating system. Total fault detection is
the ideal primary goal, with good resolution the secondary goal.

(iii) "Special" test programs, which are used only for design verifica-
tion and are intended to test the functional capability of the
simulated unit (e.g., its ability to execute the program), and
to test complex interactions between different portions of the
unit.


Initially, LAMP was used to simulate each digital circuit pack of the 
1A Processor (i) to verify the design, (ii) to verify that sufficient
access was available on the pack to detect all classical faults (output
stuck high or low, open input), and (iii) to verify that the tests were
capable of detecting these faults. This circuit pack level of simulation 
continued throughout the development process as changes were made 
to circuit packs and new packs were issued. 

A second, temporary phase for some units was the simulation of a 
functional part of a unit before the complete design was available. This 
allowed design verification through simulation to begin while other 
functions were being designed. At this level, many special test programs 
described previously were used. 

Finally, as the complete design became available, complete unit 
simulation was begun. At this level, the diagnostic tests were used, and 
the majority of design verification and fault simulation was done. 


II. DESIGN VERIFICATION


2.1 Circuit pack design verification


This section describes the use of LAMP simulation in the design 
verification of 1A Processor circuit packs. This is differentiated from 
design verification at the complete unit level and from fault simulation 
of circuit packs, which will be discussed in later sections. 


2.1.1 Objectives 


Substantial time and expense are required to produce an artmaster 
for a 1A Processor circuit pack (100 to 400 logic gates) and then to 
produce the first hardware version of the pack. It becomes important, 
therefore, to verify the accuracy of the design before this process 
begins. The purpose is to test the ability of the design to perform the 
intended functions as completely as possible. In addition to new designs 
for circuit packs, changes inevitably must be made during the course 
of the project. Again, it is important that the design of these changes 
be verified before the time-consuming process of modifying the hard- 
ware begins. For this reason, design verification of circuit packs, 
through simulation, is done not only early in the development process 
but throughout its course. 


2.1.2 Circuit pack simulation 


The mechanics of circuit pack design verification by simulation con- 
sist of building a LAMP model of the pack, devising a set of tests, 
running the tests and interpreting the results, and updating the model 
so that proposed corrections of design errors may be tested immedi- 
ately. Building and updating the model are described under complete 
unit simulation (Section 2.2.2). 


2.1.3 Circuit pack tests


Design verification at the circuit pack level uses tests written by the 
pack designer or another person familiar with the design and informa- 
tion generated automatically via the automatic test generation (ATG) 
program.’ The handwritten tests consist of sets of inputs (vectors) to 
be applied to the circuit pack. During simulation of the vectors, the 
outputs of the pack are continuously monitored and, thus, may be 
compared to a set of expected results. An effort is made to make a com-
plete “active” test of every output, i.e., to ensure that each output is
active when the inputs are selected to make it active. However, “in-
active” tests of each output (ensuring that the output is not active
when it should not be) are necessarily limited to those the test designer
feels to be high-probability cases. An exhaustive set of inactive tests 
can be prohibitively large for even a relatively simple function. 
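
The growth of the inactive test set can be illustrated with a short sketch. It assumes, purely for illustration, that the output under test is an n-input AND function; the helper names `split_tests` and `and_gate` are hypothetical and do not come from LAMP.

```python
from itertools import product


def split_tests(func, n_inputs):
    """Enumerate all input vectors and split them by the output value."""
    active, inactive = [], []
    for vector in product((0, 1), repeat=n_inputs):
        (active if func(*vector) else inactive).append(vector)
    return active, inactive


def and_gate(*bits):
    """Illustrative output function: active only when every input is 1."""
    return int(all(bits))


for n in (2, 4, 8, 16):
    active, inactive = split_tests(and_gate, n)
    print(n, len(active), len(inactive))
```

For an AND function there is a single active vector for any n, while the number of inactive vectors is 2^n - 1, which is why an exhaustive inactive test quickly becomes impractical.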

The ATG program is intended primarily for the generation of factory 
tests for the circuit packs, but it proves useful for design verification as 
well. This program approaches test generation from the same stand- 
point as the LAMP simulator, as is discussed in detail in a companion 
paper.’ One result of this approach to test generation is that the pro- 
gram effectively determines the Boolean function for each output as it 
actually exists on the pack. By printing these functions in a form so 
that they can be compared to the functions intended by the pack 
designer, ATG is an effective tool for design verification of combina- 
tional and many sequential packs. By recreating the function from the 
gate level information, ATG effectively makes both “active” and 
“inactive” tests of the function design. 
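
The idea of recreating an output function from gate-level information and comparing it with the intended function can be sketched as below. This is not the ATG algorithm, which is described in the companion paper; it is a brute-force stand-in that evaluates a small netlist for every input vector. The netlist, the gate types, and the names `evaluate` and `mismatches` are assumptions made for the example.

```python
from itertools import product

# Hypothetical pack: gate name -> (gate type, fan-in names).
NETLIST = {
    "n1": ("NAND", ["a", "b"]),
    "n2": ("NAND", ["c", "c"]),    # a NAND with tied inputs acts as an inverter
    "out": ("NAND", ["n1", "n2"]),
}
PRIMARY_INPUTS = ["a", "b", "c"]


def evaluate(netlist, values, node):
    """Return the logic value of node given primary-input values."""
    if node in values:
        return values[node]
    kind, fan_in = netlist[node]
    bits = [evaluate(netlist, values, name) for name in fan_in]
    if kind == "NAND":
        return int(not all(bits))
    if kind == "NOR":
        return int(not any(bits))
    raise ValueError("unsupported gate type: " + kind)


def mismatches(netlist, inputs, output, intended):
    """List the input vectors where the netlist disagrees with the intent."""
    bad = []
    for vector in product((0, 1), repeat=len(inputs)):
        values = dict(zip(inputs, vector))
        if evaluate(netlist, values, output) != int(intended(*vector)):
            bad.append(vector)
    return bad


# The netlist realizes (a AND b) OR c, so this comparison finds no mismatch.
print(mismatches(NETLIST, PRIMARY_INPUTS, "out", lambda a, b, c: (a and b) or c))
# If the designer had intended a AND (b OR c), two vectors would disagree.
print(mismatches(NETLIST, PRIMARY_INPUTS, "out", lambda a, b, c: a and (b or c)))
```

Because every vector is compared, both cases are checked at once: vectors where the output should be active and vectors where it should not.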


2.1.4 Experience 


The simulation of the 1A Processor circuit packs for design verifica- 
tion was not a one-time occurrence, but continued throughout the 
design process. During this simulation procedure, some errors were 
found on a majority of circuit pack codes. Without circuit pack simula- 
tion, these errors would not have been found until complete unit simula- 
tion, or perhaps not until testing of the first hardware model of the 
unit. In most cases, even if the errors were detected at the complete 
unit level of simulation, the expensive task of generating the artmaster 
for the pack would already have been completed. 

The tests developed for circuit pack design verification served as a 
substantial portion of the factory tests for these packs. The process of 
achieving complete fault detection capability uncovers redundancies, 
thereby enhancing design verification. Section III describes the fault
simulation of circuit packs. 


2.2 Complete unit design verification 


Inherent in major unit simulation is the need to first verify the 
accuracy of the simulated unit (or simulation model). This is done by 
running the diagnostic tests on the simulated unit with no faults, then 
comparing the simulation results with the expected results for each 
test. This verification procedure proves useful in three ways: it verifies 
the functional design of the unit, it verifies the accuracy of the design 
data base, and it verifies the design and expected results of the diag- 
nostic tests. 


2.2.1 Objectives 


In the design of large digital units, a time lag exists between the com- 
pletion of the initial design and the arrival of the first manufactured 
units. Before the advent of high-speed integrated circuits, the unit was 
breadboarded, in many cases, to allow continuing design feedback 
while waiting for the manufactured unit to arrive. However, with the 
increasing complexity and the higher integration levels of digital sys- 
tems, it may no longer be feasible to breadboard a digital unit for 
design verification purposes. LAMP simulation now provides an al- 
ternative to breadboarding for units as large as 40,000 gates. LAMP 
was selected over other alternatives for speed, flexibility, economy, and 
capability. 

The use of simulation has significantly decreased the design interval. 
Once the initial design is completed, it is easier to make a LAMP 
model of the unit than to manufacture it. Thus, design verification can 
begin well before the factory model arrives. As is discussed later, fea- 
tures in LAMP make testing the simulation model comparable to test- 
ing the hardware unit itself. 

Simulation reduces the number of changes that must be made after 
the unit is manufactured. Because integrated circuits are being used, 
changes no longer involve just adding or deleting wires from a wire- 
wrapped backplane. Now changes may require difficult modifications 
to printed wire or thin-film circuits or to multilayer backplanes. As the 
changes become more complex, they also take more time. Critical 
changes may halt all other debugging until the new change can be 
designed and implemented. The change facilities in LAMP permit 
fixes involving a small number of gates and wires (under 50 changes) 
to be implemented almost immediately. Larger changes can be pre-
pared in less than a day, when required. This means that, with simula-
tion, debugging can continue without significant delay when changes
are encountered. By simulating early, many changes can be found 
and incorporated into the design before the first unit is manufactured. 

Diagnostic tests are required for the first and all subsequent units in 
manufacturing. One objective of simulation is to verify the diagnostic 
test design before testing the first unit being manufactured. The unit 
test interval can be reduced significantly if it is known in advance that 
the tests are correct and that the problem is a malfunction in the unit 
being tested. 

The remainder of this section discusses how simulation is used for 
logic design and diagnostic test verification, and how it fulfills the 
objectives of decreased design interval and reduction in change ac- 
tivity after manufacture. 


2.2.2 Model building and updating 


Figure 1 is a diagram of the simulation process. As the hardware 
design moves toward completion, the information is encoded into a 
design data base. This data base is used to generate all information for 
the LAMP model, the circuit pack and interconnection drawings, the 
artmasters for the circuit packs, and wiring information for the back- 
plane. When the data base is complete, the simulation model is con- 
structed from the information in the data base. This is done in two 
stages. First, an LSL-LOCAL* description of the unit is generated 

[Figure 1 diagram omitted; legible labels: LOGIC DESIGN, DATA BASE, DESIGN, TRIAL FEEDBACK, FINAL FEEDBACK.]

Fig. 1—Design verification. 
from the data base. This LSL-LOCAL description is then compiled
into a data set of simulation tables for the model. 

The verification of the data base begins as the simulation model is 
being constructed. Error diagnostics in the programs that create the 
simulation model find some inconsistencies in the data base. Once the 
model has been constructed, more errors can be found by running the 
diagnostic tests. 

To make this process practical and easy, methods to change the 
model must be available. Figure 2 illustrates the change process. To 
have the change officially issued into the central data base and then 
to create a new model is a lengthy task, a result primarily of queuing 
delays in the various sequential steps. The data base has to be updated 
and a new model must be generated. To shorten the model updating 
time and to verify changes before they are officially issued, two other 
methods are used to modify the LAMP model. First, a text editor can 
be used to update the LSL-LOCAL description of the circuit, which is 
then compiled into a new set of LAMP tables. Second, the LAMP 
CKTCHANGE! command can be used to modify the existing tables 
directly. For small units, text editing and recompilation is as fast as 
using CKTCHANGE. Therefore, this procedure is used for small units, 
while for large units, because of its speed advantage, CKTCHANGE 
is used. Any change required is first put into the model by 

WRITE WORD (XR), DATA (0(52525252))
READ WORD (XR), EXPECT (0(52525252))
WRITE WORD (YR), DATA (0(77777777))
READ WORD (YR), EXPECT (0(77777777))
RUN CYCLE (2)
READ WORD (XR), EXPECT (0)


Fig. 3—Sample diagnostic test. 


CKTCHANGE or by editing the LSL-LOCAL description. The change 
is then verified by further simulation. If corrections are found necessary 
after simulation, the change is updated and retried. Only when the 
change is correct will it be added to the official file. This results in 
fewer changes being made to the official design data base, thus lessen- 
ing the chance for error. 


2.2.3 Simulation procedure 


Once a model has been built, tests are needed to verify the correct- 
ness of the model. The diagnostic tests are chosen as the main logic 
verification tests since the test design schedule closely parallels the 
logic design and, thus, most tests are ready at the time the simulation 
model is generated. The tests are written in a high-level language from 
which they are easily compiled into the input language for the simu- 
lator. Each test includes an explicit expected result so, while simulating, 
it is easy to ascertain if the test being simulated is passing or failing. 
Figure 3 is an example of a simple test. The X and Y registers are 
initialized and then read to verify the initialization. The run statement 
executes the test and is followed by a read to determine if the proper 
action occurred during the test. 
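
The way a test of this form carries its own expected results can be sketched with a small runner. Everything here is an assumption made for illustration: the tuple encoding of the statements, the `ToyUnit` class, the reading of the register values in Fig. 3 as octal constants, and in particular the operation performed by `run`, which is invented so that the final expected value of zero makes sense. None of it represents the diagnostic language or the simulated unit.

```python
class ToyUnit:
    """Stand-in for a simulated unit with two registers (hypothetical)."""

    def __init__(self):
        self.regs = {"XR": 0, "YR": 0}

    def write(self, reg, value):
        self.regs[reg] = value

    def read(self, reg):
        return self.regs[reg]

    def run(self, cycles):
        # Invented operation: clear in XR every bit that is set in YR.
        for _ in range(cycles):
            self.regs["XR"] &= ~self.regs["YR"]


# The statements of the sample test, encoded as tuples.
SAMPLE_TEST = [
    ("WRITE", "XR", 0o52525252),
    ("READ", "XR", 0o52525252),
    ("WRITE", "YR", 0o77777777),
    ("READ", "YR", 0o77777777),
    ("RUN", 2),
    ("READ", "XR", 0),
]


def run_test(unit, steps):
    """Apply the steps and return (index, register, expected, actual) failures."""
    failures = []
    for index, step in enumerate(steps):
        if step[0] == "WRITE":
            unit.write(step[1], step[2])
        elif step[0] == "RUN":
            unit.run(step[1])
        elif step[0] == "READ":
            actual = unit.read(step[1])
            if actual != step[2]:
                failures.append((index, step[1], step[2], actual))
    return failures


print(run_test(ToyUnit(), SAMPLE_TEST))
```

An empty failure list means every read matched its expected value; any mismatch is reported with the step at which it occurred.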

Given the LAMP model for the circuit, a set of tests to simulate, and 
the ability to make quick changes, design verification may begin. The 
standard procedure is to simulate a complete functional block of tests 
called a phase. During the simulation, a list of test failures is auto- 
matically produced by LAMP. These failures could be hard errors or 
logic 3 (output in unknown state) propagated to the output. The 
data from the failing tests are analyzed by the logic designer to deter- 
mine the source of the errors. Frequently, the solution to a problem is 
obvious from the test results. In other instances, a follow-up run is 
required to determine the exact cause of the error. 
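
The distinction between a hard error and an unknown value reaching an output can be illustrated with a three-valued gate evaluation. This is a generic sketch, not LAMP's internal representation: the constant `X = 3` merely echoes the "logic 3" mentioned above, and the function names are hypothetical.

```python
X = 3  # unknown state, following the "logic 3" convention in the text


def and3(*values):
    """AND over {0, 1, X}: a 0 input forces 0; otherwise X dominates."""
    if 0 in values:
        return 0
    return X if X in values else 1


def or3(*values):
    """OR over {0, 1, X}: a 1 input forces 1; otherwise X dominates."""
    if 1 in values:
        return 1
    return X if X in values else 0


def not3(value):
    """NOT over {0, 1, X}: the inverse of unknown is still unknown."""
    return X if value == X else 1 - value


def classify(actual, expected):
    """Label a test result as pass, unknown, or hard error."""
    if actual == expected:
        return "pass"
    return "unknown" if actual == X else "hard error"


flip_flop = X                              # e.g., a memory element never initialized
print(classify(and3(flip_flop, 1), 1))     # unknown reaches the output
print(classify(and3(flip_flop, 0), 0))     # unknown is masked by the 0 input
print(classify(or3(0, 0), 1))              # definite but wrong value
```

A test can therefore fail without any logic error if an uninitialized element is allowed to propagate to an observed output, which is one reason a follow-up run may be needed.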

LAMP provides two basic debugging facilities, an oscilloscope-like 
timing trace and a stop-and-display feature. Some follow-up runs in-
volve simulating the failing test and generating an oscilloscope-like
output trace covering a large number of points. Typically, 100 to 500 
points may be traced. This type of trace is effective when the designers 
know where the error is likely to be. The timing trace is also used ex- 
tensively with special tests of critical timing functions and complex 
sequencer interactions. This output has proved to be the most effec- 
tive technique for uncovering circuit timing problems. 

Another type of follow-up run involves simulating the phase in the 
conversational mode and imbedding stops in the simulator. The stops 
are activated when a preselected gate changes to a specified value. 
When the stop is activated, the simulation is suspended, and the logic 
designer can look at the state of any gate at that instant in time. This 
allows the designer to trace the trouble back to its source. Imbedded 
stops are used if the time of occurrence of the problem is known or if 
the problem is isolated to a particular gating lead changing state for 
an unknown reason. In the first case, a stop is planted at a particular 
time. The simulation is run up to this point and then stopped. Using 
the DISPLAY facility in LAMP, the designer then displays the states 
of critical gates. The DISPLAY command presents the current value 
of the gate along with its fan-in or fan-out gates, plus their logical level 
(value). The values are those at the time the simulation stopped. If 
these gates are in the wrong state, then the values of the inputs are 
checked to trace back the problem. The tracing continues until the 
source of the error is determined. In other cases, the value of a gating 
or data lead is used to activate the stop. This is used when it appears 
that the problem is caused by a function changing state at the wrong 
time. Again, the display facility is used to track the problem back to 
the source. 
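
A stop activated by a gate value, followed by a display of the gate and its fan-in, can be sketched as follows. The two-gate netlist, the stimulus, and the functions `run_until_stop` and `display` are hypothetical; they imitate the procedure described above and are not the LAMP commands themselves.

```python
# Hypothetical circuit, listed in evaluation order: name -> (type, fan-in).
NETLIST = {
    "g1": ("AND", ["a", "b"]),
    "g2": ("OR", ["g1", "c"]),
}


def evaluate(inputs):
    """Compute every gate value for one set of primary-input values."""
    values = dict(inputs)
    for name, (kind, fan_in) in NETLIST.items():
        bits = [values[i] for i in fan_in]
        values[name] = int(all(bits)) if kind == "AND" else int(any(bits))
    return values


def run_until_stop(stimulus, gate, value):
    """Simulate step by step; stop when gate changes to the given value."""
    previous = None
    for t, inputs in enumerate(stimulus):
        values = evaluate(inputs)
        if values[gate] == value and previous != value:
            return t, values
        previous = values[gate]
    return None, None


def display(values, gate):
    """Show a gate's value together with the values of its fan-in."""
    _, fan_in = NETLIST[gate]
    return {gate: values[gate], **{name: values[name] for name in fan_in}}


STIMULUS = [
    {"a": 0, "b": 0, "c": 0},
    {"a": 1, "b": 0, "c": 0},
    {"a": 1, "b": 1, "c": 0},
    {"a": 1, "b": 1, "c": 1},
]

stop_time, snapshot = run_until_stop(STIMULUS, "g2", 1)
print(stop_time, display(snapshot, "g2"))
print(display(snapshot, "g1"))   # trace back one level toward the source
```

Displaying `g2` shows that `g1`, not `c`, caused the change, and displaying `g1` in turn shows which inputs drove it, which mirrors the back-tracing the designer performs.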

Once the trouble is isolated by oscilloscope-like tracing or imbedded 
stops, either the circuit is changed, if there is an error in the logic 
design or in the data base, or the diagnostic test is modified, if there is 
an error in the tests. The simulation is then rerun to verify the correc- 
tion. These two procedures are so effective that, for some problems, the 
designers have preferred debugging the logic and tests via simulation 
even after hardware models are available. Running tests on simulation 
is slower than running the tests on the unit; however, debugging is 
facilitated by the ability to suspend the simulation and to observe 
large numbers of points internal to a circuit pack that are not observ- 
able on the hardware model. This has significantly reduced the design 
verification interval and, by finding the errors during simulation, 
many costly changes have been eliminated. 

Fig. 4—Central control simulation.


2.2.4 Experience 

The central control for the 1A Processor was the first and largest 
unit whose logic was extensively verified using LAMP. The experience 
gained in verifying the central control is presented here to illustrate the 
effectiveness of LAMP for logic and diagnostic test verification. Figure 
4 is a functional block diagram of the simulation models and interfaces. 
The 1A Processor contains duplicate central controls. When diagnostic 
testing is performed, the active central control tests the standby cen- 
tral control through an interface port. In the simulation process, the 
LAMP model for the complete standby central control was created. A 
simplistic model of the active central control was created to take the 
tests and apply them to the standby central control at the proper time. 
The active central control model contains an oscillator that could be 
started and stopped, allowing the circuit to settle and a new test to be 
applied at the appropriate time. In the active central control model, a 
comparator circuit was built in to check the actual results of a test 
with the expected results. Whenever the error flag gate became active, 
a message was printed listing the failing test. For conversational
simulations, this allowed termination of simulations if the number of
failures became too large. 

In addition to the diagnostic tests, a series of special test programs
designed to test program execution functionally was simulated. Since
these programs were not written in the high-level language of the
diagnostic tests, they were not simulated in the same way.
Instead, using LAMP features, a functional model consisting of a pro- 
gram store and a call store was constructed and connected to the gate 
model of the central control. The programs were compiled and loaded 
into the memories using a special memory loader facility. Special 
LAMP vectors were written to initialize the standby central control 
model and then to release the oscillator. This allowed the clock to run 
continuously and to simulate actual program fetching and execution. 
Conversational LAMP control procedures were used to stop the simula- 
tion when the program transferred to the error address or when it 
reached the return address. These programs allowed operational test- 
ing in addition to the diagnostic testing. 

For the central control verification, approximately 20,000 diagnostic 
tests and 4000 words of program were simulated. The simulation of all 
the special tests and most of the diagnostics was completed before the 
first unit arrived from the factory. Through simulation, approximately 
85 percent of the logic and 80 percent of the diagnostic tests were veri- 
fied. The areas remaining to be tested were primarily the circuitry and 
tests that interconnect the central control with its system environ- 
ment. At the present time, this type of testing is beyond the capability 
of LAMP, mainly because of the large number of gates required to 
model all the units connecting with the 1A central control. 

Many diagnostic and logic changes were generated as a result of 
simulation. LAMP is an indispensable part of the development process. 
It has proven effective in reducing development intervals and in re- 
ducing the number of circuit modifications. 


III. FAULT SIMULATION


3.1 Purposes 


Fault simulation is necessary to determine (and enhance) the detec- 
tion level of the diagnostic tests and can be viewed as an extension of 
the design verification process. The true behavior of a circuit is studied 
during verification. The set of other possible behavioral responses can 
be ascertained with fault simulation. This process is systematic; re- 
sponses are derived on a fault-by-fault basis. This type of information 
is essential to the 1A Processor subsystems because (i) it contributes
to meeting stringent factory test requirements, and (ii) the complete
characterization of subsystem behavior is necessary to satisfy long- 
term in-service system maintenance requirements. 

Fault simulation at the circuit pack level is not complicated and is 
discussed in the next section. On the other hand, unit level simulation 
is quite complex. Many factors are involved, including large gate 
counts, precise timing, complex fault modeling considerations, and 
limited computer resources. These are discussed in the remaining 
sections. 


3.2 Circuit pack simulation 


The size of 1A Processor circuit packs (100 to 400 gates) makes fault 
simulation on LAMP an easily managed process. Simulation models of 
packs are extracted from the central design data base. These models 
seldom require any additional modeling changes. Classical faults are 
simulated for every gate in the circuit. Tests are designed to detect 
every simulated fault, if possible. A mixture of manual test design 
and automatic test generation via ATG is used.’ The process is facili- 
tated through the use of interactive LAMP on the IBM 360/67 or 
370/168 computer. 

Each test consists of one or more input vectors and an expected set 
of outputs. The input vectors are simply strings of 1 and 0 combina- 
tions that are applied to the inputs of the pack. The tests need not be 
functionally arranged, but may be independent of each other. Testing 
is not “clocked”; inputs are applied simultaneously, and outputs are
examined well after the circuit has settled down. Test minimization is 
not stressed, since the cost of simulating each fault per test is small, and 
redundant test sequences may improve detection of nonclassical and/or 
multiple faults. 

As described in Section 2.1, verification tests were generated using 
a combination of manual and ATG techniques. These tests were then 
run in the fault simulation mode, achieving an average of 75 percent 
fault detection. Additional tests were generated with manual and ATG 
techniques to achieve 100 percent fault detection. 

The ATG program was used on about 50 percent of the packs to 
increase fault detection to about 90 percent although, on some com- 
binational packs (about 10 percent of total), ATG immediately pro- 
duced 100 percent detection. Manual techniques were required to pro- 
vide detection of the last 10 percent of the faults on most packs. During 
this process, design redundancies and other bugs were uncovered. 
Overall, fault simulation at the circuit pack level produced debugged 
factory tests and helped uncover design bugs, thus significantly reducing 
the intervals required for manufacture. 


3.3 Unit modeling 


The logic model used for unit level design verification forms the 
basis for the fault simulation model. In some cases, it may be necessary 
or desirable to make alterations. Simulated logic used only to support 
verification studies may be removed to save simulation time. It may 
be necessary to manually append additional models of specialized 
circuitry not kept in the design data base. Typical devices are discrete 
analog components such as operational amplifiers, current drivers, and 
signal buffers. This circuitry communicates with the logic and affects 
its state, but is nonlogic in nature. Often, many one-of-a-kind models 
must be constructed through truth table and timing diagram studies. 
In some cases, circuitry exists that is not modeled. The 1A Processor 
call/program store core memory is an example. While functional simu- 
lation can be used to model the memory for design verification at the 
present time, LAMP cannot support fault list propagation into and 
out of a functional model. 

Once the fault model is completed, circuit initialization must be 
considered prior to simulation. Ideally, simulation should begin with 
the circuit in an unknown state. This represents the most accurate 
approximation to the physical circuit whose state prior to testing is 
not necessarily predictable. The unknown state approach is normally 
used for design verification simulation, but not for fault simulation. 
This is because starting from a known state significantly reduces the 
excessive simulation CPU time caused by the potential for large fault 
list buildup during initialization. The resulting loss in accuracy is 
small. 

A known state is normally achieved by applying an initializing 
sequence to the circuit using true value simulation. Fault simulation is 
conducted starting with this true-value state and a set of “null” fault 
lists. 


3.4 Test selection 


Test selection is an important consideration for large unit fault 
simulation because it is costly to simulate an input vector, and each 
test usually expands into a series of from two to ten such vectors. 

The decision concerning which tests to simulate is influenced by the 
particular objective, the circuit model, and the available computer 
resources. Fault simulation to evaluate early tests or to support the 
improvement of tests (test enhancement) is concerned with the detection 
of classical faults. This objective permits the exclusion of tests 
designed to improve fault isolation or to detect lead shorts (such as a 
walk of 1 through a field of 0’s). Test exclusion is also necessary to size 
the tests according to the capabilities of the simulation model. This is 
most evident in tests that deal with the system environment, such as 
tests of “interunit” communication buses or of nondigital circuitry 
such as memory. A set of tests may also be simulated versus only a 
fraction of the faults. This is discussed further in Section 3.6, under 
“simulation strategy.” 

How the tests are sequenced is also important. Functional ordering 
(grouping tests according to the circuitry being tested) is essential to 
make the tests useful as a repair vehicle. The test phase is considered 
the basic functional entity whose predesigned sequence must be main- 
tained. In some cases, tests may be deleted from a phase during simula- 
tion, but the order of remaining tests is preserved. During test evalua- 
tion, phase ordering is not essential. How or when a fault is detected 
is of little consequence at this stage. Functional phase ordering assumes 
greater importance when data are collected to support trouble loca- 
tion dictionaries. Also, tests excluded from earlier stages of simulation 
are included here wherever possible. 


3.5 Fault selection 


The 1A Processor uses conventional TTL integrated circuits for 
logic function implementation. Discrete circuitry augments this logic 
where required. The major subsystems use LAMP to simulate 
“stuck-at”? (input open, output stuck at 0 and at 1) logical faults on 
TTL gates. Shorted fault simulation has, to date, not been used at the 
unit level. The restriction to stuck-at fault simulation is an engineering 
decision. In an actual circuit, the possible failure modes are much more 
extensive, including not only lead shorts but also timing changes, 
voltage changes, intermittents, etc. It is assumed that the majority 
of these faults will behave as stuck-at faults for a portion of the tests. 
Experience with previous electronic switching systems has shown this 
assumption to have validity. 

Since there are several hundred thousand possible stuck-at faults in 
1A Processor subsystems, the fault selection process must be facilitated 
by various options available in LAMP. Depending upon application, 
faults are selected on an individual basis by gate name, by circuit pack, 
or to a certain extent by hardware function. 

When detection information is being sought, random sampling is 
used. A sufficiently large random sample has been found to predict 
detection levels accurately. According to requirements, samples are 
selected on a uniform, localized, or stratified basis. 
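
How well a random sample predicts the detection level can be gauged with a standard binomial estimate. The sketch below is an illustration only: the normal-approximation interval, the function name `estimate_detection`, and the sample numbers are assumptions, not a description of how LAMP or its users computed sample sizes.

```python
import math


def estimate_detection(detected, sample_size, z=1.96):
    """Point estimate and normal-approximation interval for the detection level.

    z = 1.96 corresponds to roughly 95 percent confidence (an assumed choice).
    """
    p = detected / sample_size
    half_width = z * math.sqrt(p * (1.0 - p) / sample_size)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


# Hypothetical uniform sample: 400 of 500 sampled faults detected.
p, low, high = estimate_detection(detected=400, sample_size=500)
print(f"estimate {p:.3f}, interval {low:.3f} to {high:.3f}")
```

For a sample of a few hundred faults the interval is only a few percentage points wide, which is consistent with the statement that a sufficiently large sample predicts detection levels accurately.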

Several automatic fault administration options in LAMP have re- 
duced 1A Processor simulation costs significantly. Early termination 
is used extensively during test evaluation and enhancement. Under 
this option, a fault is terminated (removed from simulation) after one 
“hit,” or detection. Typically, this option saves from 50 to 75 percent 
of simulation time. Fault collapsing removes n — 1 out of n logically 
equivalent faults from the fault set being simulated. For example, if 
five faults cause the same effect upon a logic circuit, LAMP simulates 
one of the five. Typically, this option reduces the fault set by 25 to 50 
percent. Undetectable fault elimination, which removes faults such as 
those on unused gates, reduces fault sets up to 10 percent. 
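
Fault collapsing can be illustrated on a single gate. The sketch below groups faults whose responses are identical for every input vector and keeps one representative per group. It is an assumption-laden toy (one 3-input NAND gate, exhaustive comparison of responses); the way LAMP actually identifies equivalent faults is not described here.

```python
from itertools import product

INPUTS = ("x", "y", "z")


def nand3(vector, fault=None):
    """3-input NAND with an optional single stuck-at fault (node, value)."""
    values = dict(zip(INPUTS, vector))
    if fault is not None and fault[0] in values:
        values[fault[0]] = fault[1]
    out = 1 - (values["x"] & values["y"] & values["z"])
    if fault is not None and fault[0] == "out":
        out = fault[1]
    return out


def collapse(faults):
    """Group faults by their response to every input vector."""
    classes = {}
    for fault in faults:
        signature = tuple(nand3(v, fault) for v in product((0, 1), repeat=3))
        classes.setdefault(signature, []).append(fault)
    return list(classes.values())


all_faults = [(n, v) for n in INPUTS + ("out",) for v in (0, 1)]
classes = collapse(all_faults)
representatives = [group[0] for group in classes]
print(len(all_faults), "faults collapse to", len(representatives))
```

Here the three input stuck-at-0 faults and the output stuck-at-1 fault form one class of four, so only one of them needs to be simulated (n - 1 = 3 are removed), leaving five faults instead of eight.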

Even after every attempt to prune the fault set has been exhausted, 
it is still necessary to form fault partitions for typical large unit simula- 
tion runs. The degree of partitioning necessary depends upon the size 
of the unit, the tests, the circuit topology, and the computer resources. 
The primary factor influencing fault partitioning is the available com- 
puter main memory. Using an IBM 360/67, typical fault sets are in the 
3000- to 6000-fault range. Using an IBM 370/168, sets as large as 
30,000 faults have been simulated. 

On the other hand, test partitioning (the smallest partition being a 
phase) is primarily influenced by simulation time limits. Some par- 
titions may require 1 or 2 hours of processor time. 

For protection, the LAMP CHKPOINT/RESTART facility is used 
to permit rollback in the event of a computer, simulator, or procedural 
failure.‘ 


3.6 Simulation strategy 


An interactive simulation procedure is used in the 1A Processor for 
test enhancement. A random sample of faults is selected and simulated 
with the diagnostic tests. The results are analyzed to reveal undetected 
faults for which additional tests are designed. A new sample is then 
selected, consisting of the previous undetected faults plus an additional 
random sample, and the process is repeated. The size of a fault sample 
is generally larger than the minimum required to predict detection 
levels. This increases the probability of revealing classes of undetected 
faults. Ultimately, large classes are eliminated, and a point of diminishing 
returns is reached. At this stage, complete fault detection information 
is secured by simulating all remaining faults with the complete set 
of tests. 

Fig. 5—Two-pass diagonal simulation. 

Selection of several fault partitions (such as a random sample di- 
vided into four parts) that must be simulated against many phases, a 
phase at a time, requires the execution of many successive simulation 
runs. The most obvious way to execute these runs is to simulate the 
various combinations of fault and test partitions in succession. This 
approach does not prove efficient, and significant cost reductions are 
possible through an alternate strategy called “two-pass diagonal 
simulation.” Figure 5 is an illustration of this method. In the ideal 
situation, n mutually exclusive fault partitions and n test partitions are 
selected. Each fault partition is chosen in the area of circuitry that a 
corresponding test partition is designed to exercise. This correspondence 
significantly increases the probability of fault detection per 
second of processing time. Pass 1 simulation then consists of n runs, 
not n². Early termination is used to remove detected faults prior to 
pass 2. In pass 2, undetected faults are collected and repartitioned into 
a smaller set of runs, and all combinations of partitions are simulated. 
The effectiveness of this method lies in the fact that, for a minimum of 
resources, the great majority of faults are detected in pass 1. From a 
practical point of view, mutually exclusive fault sets with clear test 
associations are hard to produce. Instead, overlapping fault partitions 
are usually selected. Pictorially, this means that, in Fig. 5, regions off 
the major diagonal would be lightly shaded. This reduces slightly the 
efficiency of the method, but the savings are still significant. Results 
have shown that tightly connected circuits benefit least from this tech- 
nique because the concept of localized fault detection tends to break 
down. 
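
The bookkeeping behind two-pass diagonal simulation can be sketched as follows. This is a toy model under stated assumptions: the unit of work is one fault simulated against one test partition, pass 2 treats all leftover faults as a single partition, and the fault data are invented so that most faults are detected by their "own" test partition. None of the names or numbers come from LAMP.

```python
def two_pass_diagonal(fault_parts, n_test_parts, detected_by):
    """Return (detected faults, work).

    work counts fault-versus-test-partition simulations.
    detected_by[f] is the set of test partitions that detect fault f.
    """
    detected, work = set(), 0
    # Pass 1: fault partition i is run against test partition i only (n runs).
    for i, faults in enumerate(fault_parts):
        for f in faults:
            work += 1
            if i in detected_by[f]:
                detected.add(f)
    # Pass 2: leftover faults against every test partition,
    # with early termination once a fault has been detected.
    leftovers = [f for part in fault_parts for f in part if f not in detected]
    for t in range(n_test_parts):
        for f in leftovers:
            if f in detected:
                continue
            work += 1
            if t in detected_by[f]:
                detected.add(f)
    return detected, work


# Toy data: 12 faults in 3 partitions; every fourth fault is detected
# only by the next test partition rather than its own.
fault_parts = [list(range(0, 4)), list(range(4, 8)), list(range(8, 12))]
detected_by = {f: ({(f // 4 + 1) % 3} if f % 4 == 3 else {f // 4})
               for f in range(12)}
detected, work = two_pass_diagonal(fault_parts, 3, detected_by)
all_versus_all = 12 * 3   # every fault against every test partition
print(len(detected), work, all_versus_all)
```

With these invented numbers all 12 faults are detected with 18 units of work instead of 36. The saving depends entirely on how well fault partitions line up with test partitions, which is why tightly connected circuits benefit least.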

The method just described is used primarily for simulation where 
detection information is being sought. In other applications such as 
data collection for trouble location dictionaries, the procedure of Fig. 5 
is not used since it produces incomplete fault behavior information. 
In such cases, the simple strategy of simulating all fault and test parti- 
tions may be utilized with corresponding CPU time increases. 


3.7 Experience 


A composite LAMP model of the 1A Processor call/program store 
was constructed to support design evaluation and diagnostic develop- 
ment. Figure 6 is a block diagram showing interrelations of distinct 
portions of this model. 

The model contains approximately 10,000 LAMP gates. This count 
is about 40 percent higher than the real gate count because of attendant 
logic controlling bus interaction and specially modeled “nonlogic” 
circuitry. This model was verified in the manner described in Section 
II. A preliminary diagnostic was designed consisting mainly of functional 
exercise tests, which translated into about 8000 LAMP input 
vectors distributed over a number of phases. At the time, the inability 
to model the core memories with LAMP gates necessitated omission of 
several test phases. 

Fig. 6—Program/call store LAMP model. 


Table I — Program store simulation data 

Statistic                                  15,000 Faults    580-Fault Sample 
Number Detected*                           10,400           400 
Number Race Faults                         2200             83 
Number Oscillation Faults                  80               3 
Detection with Races Removed               81%              80% 
Detection Estimate Following First 
  Test Enhancement Iteration               90%              90% 
Estimate of Ultimate Detection Level       96%              96% 

* Using initial set of tests. 

A study was conducted using the model of Fig. 6 and these tests to 
meet the following objectives: 


(i) Determine the detection power of the tests. 
(ii) Provide data to support test enhancement. 
(iii) Evaluate proposed methods of data collection for use with other 
units (this was a pilot study for large unit fault simulation). 


The model contained a set of 15,000 meaningful classical faults. As a 
first experiment, this set was simulated using the two-pass diagonal 
simulation method of the previous section. As a second experiment, a 
random sample of 580 faults was selected and simulated against all 
tests. In the first experiment, the number of faults in each partition 
was chosen to optimize use of the host computer. For the random 
sample, it was possible to simulate the 580 faults in one partition. 
About eight partitions were selected for the 15,000-fault experiment, 
each containing on the average 4000 faults. Significant overlapping of 
fault partitions was necessary because the minimum faultable unit 
was selected as a circuit pack (out of convenience) which often en- 
compassed several functions. 

About 24 CPU hours of IBM 360/67 computer time were required 
to complete the above experiments. This included the time required to 
restart several runs because of procedural errors and system crashes. 
Through some additional runs, it was possible to show that diagonal 
simulation saved about 50 percent of the CPU time that would have 
been required to simulate all tests versus all faults. 

Table I contains some simulation results and estimates for the two 
experiments. It is important to emphasize the fact that the tests were 
an initial cut used to support early factory testing of frames. The 
resulting detection level of 81 percent was about as expected at this 
stage of design. The most significant result was the relatively large 
number of race faults encountered. The LOGIC simulator® was used 
for the study to save computer time. This resulted in race faults (those 
causing indeterminate gate or output states) and oscillation faults 
(those leading to circuit oscillation) being removed during simulation 
when encountered. Subsequent studies using the FAULT simulator? 
showed that most race faults were in reality noncritical races that did 
not cause indeterminate output states. Furthermore, the race faults 
were actually detected at about the same level as normal faults. In 
Table I, it was, therefore, reasonable to remove them from the sample 
in computing the actual detection level. 
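
The "Detection with Races Removed" entries of Table I can be reproduced by removing the race faults from the fault count before dividing. The short check below assumes that oscillation faults stay in the denominator as undetected faults; that reading is inferred from the table values rather than stated explicitly, and the function name is illustrative.

```python
def detection_with_races_removed(detected, total_faults, race_faults):
    """Detected faults as a fraction of the faults left after dropping races."""
    return detected / (total_faults - race_faults)


full_set = detection_with_races_removed(10_400, 15_000, 2_200)
sample = detection_with_races_removed(400, 580, 83)
print(f"15,000-fault set: {full_set:.1%}, 580-fault sample: {sample:.1%}")
```

Both figures round to the tabulated 81 and 80 percent, which shows how closely the 580-fault sample tracks the full fault set.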

Oscillation faults are a serious problem that demand careful study. 
They potentially jeopardize system operation by causing interference 
with other units on the system buses. Although such faults would prob- 
ably be detected in the real system, they were not considered as such 
in Table I. 

The undetected faults in the random sample were carefully analyzed 
and led to some interesting results. Fourteen major classes of undetected 
faults (with respect to the 15,000-fault set) were categorized. 
In most cases, these classes consist of similar faults on repetitive func- 
tions, such as successive bits of a register, which require a few tests for 
detection. In some cases, other faults were revealed that implied the 
design of a class of new tests. Several one-of-a-kind faults, each requir- 
ing unique tests, were also revealed by the analysis. About 2 percent of 
the faults were undetectable for various reasons. Sixty percent of the 
faults were associated with operational logic, and the rest with main- 
tenance circuitry. 

It is estimated in Table I that new tests designed because of the 
results of the random sample analysis will reduce the undetected fault 
set to about 10 percent. It is also estimated that, through the use of 
LAMP in this iterative process, no more than three iterations will be 
required to achieve an ultimate detection level of 96 percent of the 
classical faults. 

Resolution of the remaining 4 percent, which are truly undetectable, 
poses a problem. Some represent true circuit redundancies (in the 
simplest case, a single gate output feeding a gate twice). Others, more 
subtle, reside in circuitry used to improve noise or electrical margins. 
These might be detected under worst-case conditions by existing tests. 
A third class deals with system constraints (inputs constrained not to 
assume certain state combinations). 

It is impractical to consider complete removal of these faults through 
design changes. In any event, LAMP has done its job by categorizing 
these faults. Maintenance information can at least be provided to help 
deal with the possibility of their existence throughout the life of the 
system. 


IV. CONCLUSION 


When LAMP was first introduced, it received almost immediate 
acceptance and support from circuit and diagnostic program design 
groups, although many growing pains were involved with its use. On 
countless occasions, the user community taxed both LAMP and its 
host computer resources to their limits. In response to this, LAMP has 
grown and matured, making significant improvements in capacity, 
speed, and capability. 

The use of LAMP to verify the paper design of 1A Processor sub- 
systems significantly reduced laboratory debugging intervals and pro- 
vided major cost reductions. Logic design errors were located and cor- 
rected prior to the construction of initial hardware. In association with 
this, the “first iteration” of diagnostic design, the debugging of func- 
tional tests using LAMP simulation, was completed prior to the avail- 
ability of system laboratories. These tests were then used to test the 
frames in the factory environment. 

The availability of interactive LAMP has been a significant aspect 
of design verification. The option to freeze the state of a logic simula- 
tion in order to examine internal nodes has proved so powerful that 
circuit designers have sometimes preferred this facility to the actual 
unit as a debugging tool. 

The LAMP fault simulator has been essential to the development 
of complete circuit pack test vectors for the 1A Processor. The exten- 
sion of fault simulation to large subsystems is just beginning. Initial 
fault simulation studies using the 1A Processor program/call store are 
encouraging. Iterative test enhancement using LAMP will insure the 
detection or classification (as to reason for not being detected) of every 
stuck-at logical fault. This is of primary importance because of the 
very stringent maintenance requirements of the 1A Processor. 

In the future, the trend toward higher scales of logic integration will 
increase the use of LAMP for design verification and factory test 
development. The use of LAMP as a breadboard will become a practical 
necessity. 


1554 THE BELL SYSTEM TECHNICAL JOURNAL, OCTOBER 1974 


Rain-Induced Cross-Polarization at Centimeter 
and Millimeter Wavelengths 


By T. S. CHU 
(Manuscript received March 13, 1974) 


Rain-induced cross-polarization is an important factor in design of 
dual-polarization microwave radio communication systems. We present 
current estimates of this effect based upon calculated differential char- 
acteristics of canted oblate raindrops and their relationship to experiments. 
Measured differential attenuation and cross-polarization, mainly at 18 
GHz, are used to determine two empirical parameters: an effective average 
of the absolute value of the canting angle and a measure of the imbalance 
between positive and negative canting angles. We can then provide estimates 
for median values of cross-polarization discriminations at other frequen- 
cies; these are found to agree fairly well with available measured data. 

Differential phase shift is the dominant factor in the rain-induced cross- 
polarization at frequencies below about 10 GHz, and differential attenua- 
tion becomes increasingly important at higher frequencies. For a given 
rain fading, the cross-polarization decreases with increase in frequency 
and is relatively insensitive to the rain rate, whereas for a given amount of 
rain the cross-polarization increases with frequency up to about 35 GHz. 
The cross-polarization discrimination of circularly polarized waves is 
much poorer than that of linearly polarized waves. When the angle α 
between the direction of propagation and the axis of symmetry of oblate 
raindrops is not equal to π/2, as on earth-space paths in satellite communication 
systems, the differential attenuation and differential phase shift 
can be approximated by sin² α times those for α = π/2, which is the condition 
for terrestrial paths. 


I. INTRODUCTION 


Understanding depolarization properties of the transmission medium 
is of crucial importance in planning frequency reuse by employing 
orthogonal polarizations in a radio communication system. The 
rain-induced depolarization, which accompanies heavy rain attenuation, 
has attracted considerable theoretical and experimental efforts.¹⁻⁹ A 
mathematical model of canted oblate spheroidal raindrops as shown in 
Fig. 1 has been assumed in most theoretical investigations. 
Cross-coupling between vertical and horizontal polarizations occurs as a 
result of the differential attenuation and differential phase shift between 
two polarizations parallel and perpendicular to a major axis of 
the oblate raindrops. Recently, Morrison et al.¹⁰,¹¹ have given the calculated 
differential characteristics for various rain rates throughout 
the microwave range from 4 to 100 GHz. These results are based upon 
numerical solutions” of the scattering of a plane electromagnetic wave 
by oblate spheroidal raindrops using a point-matching procedure or 
perturbation about an equivolumic sphere. The modified perturbation 
scheme¹¹ offers a substantial improvement over Oguchi’s previous 
perturbation calculations.! Very recently, Oguchi!- also used a 
point-matching procedure to make similar calculations. The purpose of this 
paper is to assess our current understanding of the rain-induced microwave 
depolarization in view of these new calculations and their relationship 
to the measured data. 

Fig. 1—Canted oblate spheroidal raindrop. 


Table I — Attenuation and phase shift at 4 GHz with α = 90° 

Rain Rate    A_I              A_II             Φ_I              Φ_II 
(mm/hr)      (dB/km)          (dB/km)          (deg/km)         (deg/km) 
  0.25       1.825 × 10⁻⁴     2.013 × 10⁻⁴     1.413 × 10⁻¹     1.487 × 10⁻¹ 
  1.25       7.806 × 10⁻⁴     8.931 × 10⁻⁴     5.494 × 10⁻¹     5.894 × 10⁻¹ 
  2.5        1.506 × 10⁻³     1.759 × 10⁻³     9.933 × 10⁻¹     1.077 
  5.0        2.980 × 10⁻³     3.570 × 10⁻³     1.806            1.984 
 12.5        7.581 × 10⁻³     9.435 × 10⁻³     3.995            4.478 
 25.0        1.610 × 10⁻²     2.089 × 10⁻²     7.377            8.417 
 50.0        3.547 × 10⁻²     4.836 × 10⁻²     13.83            16.10 
100.0        8.038 × 10⁻²     1.164 × 10⁻¹     26.15            31.10 
150.0        1.3824 × 10⁻¹    2.021 × 10⁻¹     38.85            46.23 


Table II — Attenuation and phase shift at 11 GHz with α = 90° 

Rain Rate    A_I              A_II             Φ_I              Φ_II 
(mm/hr)      (dB/km)          (dB/km)          (deg/km)         (deg/km) 
  0.25       2.428 × 10⁻³     2.669 × 10⁻³     3.985 × 10⁻¹     4.195 × 10⁻¹ 
  1.25       1.592 × 10⁻²     1.820 × 10⁻²     1.579            1.697 
  2.5        3.787 × 10⁻²     4.399 × 10⁻²     2.880            3.127 
  5.0        9.144 × 10⁻²     1.076 × 10⁻¹     5.266            5.783 
 12.5        2.907 × 10⁻¹     3.470 × 10⁻¹     11.69            13.06 
 25.0        6.898 × 10⁻¹     8.293 × 10⁻¹     21.382           24.18 
 50.0        1.605            1.945            38.94            44.93 
100.0        3.586            4.392            70.25            82.58 
150.0        5.605            6.919            99.26            118.3 


The considerable uncertainty about the canting-angle distribution 
can be characterized by two parameters as suggested by Thomas.? The 
first parameter, which is an effective average of the absolute value of 
the canting angle, can be determined by comparing the calculated 
values with the measured differential attenuation between vertical 
and horizontal polarizations. The second parameter, which is a measure 
of the imbalance between positive and negative canting angles 
with respect to the vertical direction, can be determined by comparison 
between the calculated and measured cross-polarizations. The re- 
liability of empirical parameters will be much improved by recent 
developments in theory and experiment. Then the theoretical model 
provides systematic extrapolation of the measured data. 

Section II tabulates the details of the calculated attenuation and 
phase shift that were abbreviated in Ref. 10. The normalized differential 
characteristics with respect to rain fading give physical insight 
and the main features of the rain-induced cross-polarization. Section 
III describes the calculation of depolarization for both linearly and 
circularly polarized waves. Section IV determines empirical parameters 
of the canting-angle distribution. Estimates are made for the 
median values of the rain-induced depolarization at centimeter and 
millimeter wavelengths. An approximation for the case of oblique 
propagation is also examined. 


Table III — Attenuation and phase shift at 18.1 GHz with α = 90° 

Rain Rate    A_I              A_II             Φ_I              Φ_II 
(mm/hr)      (dB/km)          (dB/km)          (deg/km)         (deg/km) 
  0.25       9.797 × 10⁻³     1.071 × 10⁻²     6.674 × 10⁻¹     7.029 × 10⁻¹ 
  1.25       6.483 × 10⁻²     7.275 × 10⁻²     2.608            2.801 
  2.5        1.458 × 10⁻¹     1.658 × 10⁻¹     4.680            5.078 
  5.0        3.205 × 10⁻¹     3.702 × 10⁻¹     8.362            9.182 
 12.5        8.927 × 10⁻¹     1.055            17.78            19.90 
 25.0        1.874            2.273            31.33            35.63 
 50.0        3.869            4.846            55.24            63.88 
100.0        7.696            10.04            97.39            114.1 
150.0        11.5             15.39            137.0            161.4 


Table IV — Attenuation and phase shift at 30 GHz with α = 90° 

Rain Rate    A_I              A_II             Φ_I              Φ_II 
(mm/hr)      (dB/km)          (dB/km)          (deg/km)         (deg/km) 
  0.25       3.347 × 10⁻²     3.660 × 10⁻²     1.102            1.160 
  1.25       1.960 × 10⁻¹     2.221 × 10⁻¹     4.134            4.428 
  2.5        4.121 × 10⁻¹     4.757 × 10⁻¹     7.220            7.789 
  5.0        8.506 × 10⁻¹     1.001            12.51            13.57 
 12.5        2.179            2.634            25.26            27.56 
 25.0        4.321            5.321            42.54            46.31 
 50.0        8.444            10.58            71.16            76.72 
100.0        15.96            20.25            118.6            125.2 
150.0        23.05            29.37            161.3            167.7 


II. DIFFERENTIAL ATTENUATION AND DIFFERENTIAL PHASE SHIFT 


The ratio of minor to major axes of the oblate spheroidal raindrop 
is assumed to be linearly dependent upon the radius ā (in centimeters) 
of the equivolumic spherical drop; specifically, a/b = 1 − ā. This 
relationship is a simple approximation for the experimental data of 
the drop shape.!® Morrison and Cross” have used a least-squares-fitting 
procedure to calculate the complex forward scattering functions S_I(0) 
and S_II(0)¹⁰ for the two polarizations I and II parallel and perpendicular 
to the plane containing the axis of symmetry of the raindrop and the 
direction of propagation of the incident wave. They have given numerical 
tables of forward scattering functions for all the raindrop sizes 
of the Laws and Parsons distribution. 


Table V — Attenuation and phase shift at 30 GHz with α = 70° 

Rain Rate    A_I              A_II             Φ_I              Φ_II 
(mm/hr)      (dB/km)          (dB/km)          (deg/km)         (deg/km) 
  0.25       3.372 × 10⁻²     3.648 × 10⁻²     1.108            1.160 
  1.25       1.984 × 10⁻¹     2.215 × 10⁻¹     4.170            4.433 
  2.5        4.183 × 10⁻¹     4.746 × 10⁻¹     7.293            7.803 
  5.0        8.664 × 10⁻¹     9.996 × 10⁻¹     12.66            13.61 
 12.5        2.230            2.633            25.61            27.67 
 25.0        4.440            5.326            43.18            46.53 
 50.0        8.712            10.60            72.26            77.19 
100.0        16.53            20.33            120.3            126.1 
150.0        23.91            29.50            163.4            168.9 


Table VI — Attenuation and phase shift at 30 GHz with α = 50° 

Rain Rate    A_I              A_II             Φ_I              Φ_II 
(mm/hr)      (dB/km)          (dB/km)          (deg/km)         (deg/km) 
  0.25       3.433 × 10⁻²     3.617 × 10⁻²     1.127            1.161 
  1.25       2.045 × 10⁻¹     2.199 × 10⁻¹     4.272            4.447 
  2.5        4.343 × 10⁻¹     4.719 × 10⁻¹     7.501            7.840 
  5.0        9.068 × 10⁻¹     9.959 × 10⁻¹     13.07            13.70 
 12.5        2.362            2.632            26.57            27.93 
 25.0        4.746            5.339            44.90            47.11 
 50.0        9.398            10.67            75.17            78.36 
100.0        17.97            20.52            124.7            128.3 
150.0        26.10            29.84            168.8            172.0 


The rain-induced attenuation 
and phase shift are obtained from the forward scattering functions as! 


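$$A_{I,II} = 0.434\,\frac{\lambda^{2}}{\pi} \sum \operatorname{Re} S_{I,II}(0)\, n(\bar{a}) \quad \text{(dB/km)} \qquad (1)$$


$$\Phi_{I,II} = -\frac{36\,\lambda^{2}}{4\pi^{2}} \sum \operatorname{Im} S_{I,II}(0)\, n(\bar{a}) \quad \text{(deg/km)}, \qquad (2)$$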


where λ is the wavelength in centimeters and n(ā) is the number of 
drops with equivolumic radius ā per cubic meter. For α = 90°, the 
attenuation and phase shift for various rain rates at 4, 11, 18.1, and 30 
GHz have been listed in Tables I to IV. Here α is the angle between the 
direction of propagation and the axis of symmetry of the raindrop. The 
cases α = 50° and 70° at 30 GHz have been listed in Tables V and VI. 
The differential attenuation and the differential phase shift were presented 
graphically in Ref. 10, whereas the above tables are documented 
here for the reader interested in more details. Most results of this paper 
belong to the case α = 90°, which is pertinent to terrestrial microwave 
relay systems. However, we discuss later an approximation for other 
values of α that are of interest in satellite systems. 


Table VII — Index of refraction of water at 20°C (computed 
from a recent empirical equation in Ref. 17) 

Frequency in GHz    Index of Refraction 
  4                 8.77 + 0.915i* 
  5                 8.685 + 1.195i 
  6                 8.574 + 1.399i 
  8                 8.319 + 1.761i 
 11                 7.884 + 2.184i 
 14                 7.437 + 2.477i 
 18.1               6.859 + 2.716i 
 20                 6.614 + 2.780i 
 24                 6.151 + 2.849i 
 30                 5.581 + 2.848i 
 40                 4.886 + 2.725i 
 60                 4.052 + 2.393i 
100                 3.282 + 1.864i 

* Since the calculations at 4 GHz were made at an earlier date, the 4-GHz refractive 
index was taken from the older literature (Ref. 18). 


Fig. 2—Calculated attenuation coefficients of polarization I from perturbation 
about an equivolumic sphere. 
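
The abstract states that, for oblique incidence, the differential attenuation and differential phase shift are approximately sin² α times their values at α = 90°. The 30-GHz entries of Tables IV to VI allow a quick numerical check of that statement. The sketch below uses only the 50 mm/hr rows as transcribed from those tables; the function names and the choice of rain rate are illustrative assumptions.

```python
import math

# alpha in degrees -> (A_I, A_II in dB/km, Phi_I, Phi_II in deg/km)
# at 30 GHz and 50 mm/hr, transcribed from Tables IV, V, and VI.
ROWS_30GHZ_50MM = {
    90: (8.444, 10.58, 71.16, 76.72),
    70: (8.712, 10.60, 72.26, 77.19),
    50: (9.398, 10.67, 75.17, 78.36),
}


def differential(row):
    """Return (A_II - A_I, Phi_II - Phi_I) for one table row."""
    a1, a2, p1, p2 = row
    return a2 - a1, p2 - p1


def sin2_check(alpha_deg):
    """Ratios of the differential quantities to their 90-degree values,
    together with sin^2(alpha) for comparison."""
    da90, dp90 = differential(ROWS_30GHZ_50MM[90])
    da, dp = differential(ROWS_30GHZ_50MM[alpha_deg])
    return da / da90, dp / dp90, math.sin(math.radians(alpha_deg)) ** 2


for alpha in (70, 50):
    ratio_a, ratio_p, s2 = sin2_check(alpha)
    print(f"alpha={alpha}: dA ratio {ratio_a:.3f}, "
          f"dPhi ratio {ratio_p:.3f}, sin^2 {s2:.3f}")
```

For these rows the tabulated ratios agree with sin² α to within about 0.02, which supports using the approximation on earth-space paths.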

Since the results of a modified perturbation scheme showed close 
agreement with those of the least-squares-fitting procedure, the rain- 
induced differential attenuation and differential phase shift based 
upon that perturbation scheme have been graphically presented¹¹ for 
various rain rates from 4 to 100 GHz. As a supplement to Ref. 11, the 
refractive indices used for the calculations are listed in Table VII, 
computed from an equation in Ref. 17. For a given rain rate, the differ- 
ential attenuation increases with frequency until about 35 GHz, 
whereas the differential phase shift peaks around 20 GHz and then 
decreases sharply to negative values for millimeter wavelengths. The 
cross-polarization is expected to increase with frequency until about 
35 GHz for a given amount of rain. 

As reference, Fig. 2 presents A_I obtained from the modified perturbation 
scheme. Combining Fig. 2 with the differential attenuation data 
of Ref. 11 results in A_II. In comparison with Setzer’s data,!® we note 
that the attenuation by equivolumic spherical drops lies between A_I 
and A_II, but closer to A_II. Since the average of the absolute value of 
the canting angle is about 25°, as demonstrated later, the attenuation 
of vertical polarization will be greater than A_I, whereas the attenuation 
of horizontal polarization will be less than A_II. 

A communication system is usually designed for a certain margin of 
fading. In propagation experiments, attenuation and cross-polarization 
are often simultaneously measured for correlation with each other. 
The above two considerations suggest the normalization of both differential 
attenuation and differential phase shift with respect to the 
attenuation of polarization I as shown in Figs. 3, 4a, and 4b. These 
presentations invite a number of observations, as described below. 

Fig. 3—Normalized differential attenuation with respect to A_I. 
(Vertical axis: (A_II − A_I)/A_I, 0.1 to 0.5; horizontal axis: 4 to 100.) 

Fig. 4a—Normalized differential phase shift with respect to A_I (4 to 30 GHz). 

Fig. 4b—Normalized differential phase shift with respect to A_I (30 to 100 GHz). 

It is well known that the rain attenuation of longer microwaves such 
as 4 GHz is slight. Therefore, the relatively high ratio for the normalized
differential attenuation in this frequency region will make
little contribution to depolarization. However, the very high differ- 
ential phase shift at 4 GHz in Fig. 4a indicates that significant de- 
polarization is possible even for only a few decibels of attenuation that 
can occur on a long path, say 40 km between two repeaters. Barnett® 
has reported experimental observations of 4-GHz depolarization 
during heavy rain. Here, the differential phase shift is indeed the 
dominant cause of rain-induced depolarization. 

For a given fade, the differential phase shift declines sharply with 
the increase in frequency, whereas the normalized differential attenua- 
tion is relatively insensitive to frequency. Therefore, the differential 
attenuation becomes increasingly important as the frequency increases. 
The differential attenuation in nepers and the differential phase shift 
in radians are about the same at about 20 GHz. The sharp descent of 
the differential phase shift also implies less depolarization at higher 
microwave frequencies for a given fade. 

We note that the decline of the differential phase shift continues into 
negative values at millimeter wavelengths as shown in Fig. 4b. Al- 
though the absolute accuracy of the differential phase shift above 30 
GHz is somewhat doubtful in view of its dependence on the cancella- 
tion between large numbers, we should expect small differential phase 
shift per decibel of fading at millimeter wavelengths. Furthermore, 
the normalized differential attenuation also becomes small because 
smaller rain drops of less ellipticity are contributing heavily to the 
attenuation at shorter wavelengths. Delange et al.!° observed only 
2-dB differential attenuation between vertical and horizontal polari- 
zations with a rain fading of 40 dB at 60 GHz. 

As the rain rate decreases, the normalized differential attenuation 
of each frequency decreases. On the other hand, the differential phase 
shift per decibel of fading generally increases with the decrease of the 
rain rate. These two opposite trends tend to keep the depolarization 
relatively insensitive to the rain rate, as is shown in Section 4.2. 


III. CALCULATION OF DEPOLARIZATION


Let the axis of symmetry of the oblate raindrop be oriented with
respect to the vertical direction at an angle $\theta$, called the canting
angle, which is discussed later. We now calculate the depolarization
resulting from the anisotropy described in the preceding section. In
practice, dual-polarization radio communication systems employ either
two orthogonal linear or circular polarizations. The orthogonal linear
polarizations are usually aligned in the vertical and horizontal
directions. The polarization transformation of an anisotropic medium can
be conveniently obtained by listing the following matrix operations:

\[
\begin{pmatrix} H_r \\ V_r \end{pmatrix}
=
\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}
\begin{pmatrix} T_2 & 0 \\ 0 & T_1 \end{pmatrix}
\begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}
\begin{pmatrix} H_t \\ V_t \end{pmatrix},
\tag{3}
\]

where $T_2$ and $T_1$ are the transmission coefficients over a path of length
$L$ for polarizations II and I,

\[ T_2 = \exp[-(\alpha_2 - j\beta_2)L], \tag{4} \]
\[ T_1 = \exp[-(\alpha_1 - j\beta_1)L]. \tag{5} \]


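Carrying out the multiplication of the matrices yields the relationship
between input (transmitted) and output (received) polarizations:

\[
\begin{pmatrix} H_r \\ V_r \end{pmatrix}
=
\begin{pmatrix} a_{hh} & a_{hv} \\ a_{vh} & a_{vv} \end{pmatrix}
\begin{pmatrix} H_t \\ V_t \end{pmatrix},
\tag{6}
\]

where

\[ a_{hh} = T_2 \cos^2\theta + T_1 \sin^2\theta, \tag{7} \]
\[ a_{vv} = T_1 \cos^2\theta + T_2 \sin^2\theta, \tag{8} \]
\[ a_{hv} = a_{vh} = \left(\frac{T_2 - T_1}{2}\right) \sin 2\theta. \tag{9} \]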

The following expressions are given for the convenience of
computation:

\[ |a_{hh}|^2 = \exp[-(\alpha_1 + \alpha_2)L]\,(D + C\cos^2 2\theta - E\cos 2\theta), \tag{10} \]
\[ |a_{vv}|^2 = \exp[-(\alpha_1 + \alpha_2)L]\,(D + C\cos^2 2\theta + E\cos 2\theta), \tag{11} \]
\[ |a_{hv}|^2 = \exp[-(\alpha_1 + \alpha_2)L]\,C\sin^2 2\theta, \tag{12} \]

where

\[ C = \cos^2\beta L\,\sinh^2\alpha L + \sin^2\beta L\,\cosh^2\alpha L, \tag{13} \]
\[ D = \cosh^2\alpha L\,\cos^2\beta L + \sinh^2\alpha L\,\sin^2\beta L, \tag{14} \]
\[ E = 2\cosh\alpha L\,\sinh\alpha L. \tag{15} \]

The differential attenuation coefficient and the differential phase shift
coefficient are $2\alpha = \alpha_2 - \alpha_1$ and $2\beta = \beta_2 - \beta_1$, respectively. When $\alpha L$
(in nepers) and $\beta L$ (in radians) are small, eq. (13) may be
approximated by $C \approx (\alpha L)^2 + (\beta L)^2$.
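
The closed forms (10) to (12) can be checked numerically against the matrix elements (7) to (9). The Python sketch below is illustrative only: the function names and the sample values of $\alpha_1$, $\alpha_2$, $\beta_1$, $\beta_2$, $L$, and $\theta$ are assumptions chosen for the check, not data from the paper.

```python
import cmath
import math


def transmission(alpha, beta, length):
    """Transmission coefficient T = exp[-(alpha - j*beta)*L], eqs. (4), (5)."""
    return cmath.exp(-(alpha - 1j * beta) * length)


def depolarization_matrix(t1, t2, theta):
    """Elements a_hh, a_vv and a_hv (= a_vh) from eqs. (7)-(9)."""
    c, s = math.cos(theta), math.sin(theta)
    a_hh = t2 * c * c + t1 * s * s
    a_vv = t1 * c * c + t2 * s * s
    a_hv = 0.5 * (t2 - t1) * math.sin(2.0 * theta)
    return a_hh, a_vv, a_hv


def closed_form_powers(alpha1, alpha2, beta1, beta2, length, theta):
    """|a_hh|^2, |a_vv|^2, |a_hv|^2 from eqs. (10)-(15)."""
    a = 0.5 * (alpha2 - alpha1) * length  # alpha*L, nepers
    b = 0.5 * (beta2 - beta1) * length    # beta*L, radians
    C = math.cos(b) ** 2 * math.sinh(a) ** 2 + math.sin(b) ** 2 * math.cosh(a) ** 2
    D = math.cosh(a) ** 2 * math.cos(b) ** 2 + math.sinh(a) ** 2 * math.sin(b) ** 2
    E = 2.0 * math.cosh(a) * math.sinh(a)
    common = math.exp(-(alpha1 + alpha2) * length)
    c2 = math.cos(2.0 * theta)
    s2 = math.sin(2.0 * theta)
    return (common * (D + C * c2 ** 2 - E * c2),
            common * (D + C * c2 ** 2 + E * c2),
            common * C * s2 ** 2)


# Hypothetical sample values (Np/km, rad/km, km, radians).
ALPHA1, ALPHA2, BETA1, BETA2, LENGTH = 0.10, 0.12, 0.30, 0.33, 5.0
THETA = math.radians(25.0)

t1 = transmission(ALPHA1, BETA1, LENGTH)
t2 = transmission(ALPHA2, BETA2, LENGTH)
direct = [abs(x) ** 2 for x in depolarization_matrix(t1, t2, THETA)]
closed = closed_form_powers(ALPHA1, ALPHA2, BETA1, BETA2, LENGTH, THETA)
print(direct)
print(closed)
```

The two printed triples should agree to rounding error, since (10) to (12) follow from (7) to (9) by factoring out the mean propagation constant.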

Since both $a_{hh}$ and $a_{vv}$ are independent of the sign of $\theta$, an average
of the absolute value of the canting angle can be obtained by comparing
the calculated $|a_{hh}/a_{vv}|$ for various $\theta$ with the measured differential
attenuation between vertical and horizontal polarizations. The
expressions for $a_{hv} = a_{vh}$ indicate that the cross-polarized components
resulting from positive and negative canting angles tend to cancel each
other. Furthermore, Saunders? found from photographs of falling
raindrops that the canting angle is not far from an even distribution
between the positive and the negative senses. Then the cross-polarization
coefficient $|a_{hv}|$ or $|a_{vh}|$ should be reduced by a factor $\epsilon$ which is a
measure of the imbalance between positive and negative canting angles.
Having first obtained the average of the absolute value of the canting
angle, $\epsilon$ can be determined by comparison between the calculated and
measured cross-polarization data.

Then, when transmitting horizontal and vertical polarizations the
crosstalk discriminations are

\[
\mathrm{XTDH} = \epsilon^2 \left|\frac{a_{hv}}{a_{hh}}\right|^2
= \frac{\epsilon^2 C \sin^2 2\theta}{D + C\cos^2 2\theta - E\cos 2\theta}
\tag{16a}
\]

for receiving horizontal polarization, and

\[
\mathrm{XTDV} = \epsilon^2 \left|\frac{a_{vh}}{a_{vv}}\right|^2
= \frac{\epsilon^2 C \sin^2 2\theta}{D + C\cos^2 2\theta + E\cos 2\theta}
\tag{16b}
\]

for receiving vertical polarization. The crosstalk discriminations are
important information for communication systems. In propagation
experiments, we often receive both horizontal and vertical
polarizations with one transmitted polarization. The cross-polarization
discriminations are

\[
\mathrm{XPDH} = \epsilon^2 \left|\frac{a_{vh}}{a_{hh}}\right|^2
= \frac{\epsilon^2 C \sin^2 2\theta}{D + C\cos^2 2\theta - E\cos 2\theta}
\tag{17a}
\]

for transmitting horizontal polarization and

\[
\mathrm{XPDV} = \epsilon^2 \left|\frac{a_{hv}}{a_{vv}}\right|^2
= \frac{\epsilon^2 C \sin^2 2\theta}{D + C\cos^2 2\theta + E\cos 2\theta}
\tag{17b}
\]

for transmitting vertical polarization. The crosstalk discriminations in
(16) are numerically the same as the cross-polarization discriminations
in (17). Experimental confirmation of this theoretical equivalence
based on the simple model has been reported by Watson and Arbabi.”
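
As a rough illustration of how (13) to (15) and (17) might be evaluated, the following Python sketch computes XPDH and XPDV in decibels. The function names and the sample path values ($\alpha L$ = 0.05 Np, $\beta L$ = 0.1 rad) are assumptions made for this example; the canting angle of 25° and $\epsilon$ = 0.14 are the estimates the paper arrives at in Section 4.1.

```python
import math


def coefficients(alpha_l, beta_l):
    """C, D, E of eqs. (13)-(15); alpha_l in nepers, beta_l in radians."""
    ch, sh = math.cosh(alpha_l), math.sinh(alpha_l)
    cb, sb = math.cos(beta_l), math.sin(beta_l)
    C = cb ** 2 * sh ** 2 + sb ** 2 * ch ** 2
    D = ch ** 2 * cb ** 2 + sh ** 2 * sb ** 2
    E = 2.0 * ch * sh
    return C, D, E


def xpd_linear_db(alpha_l, beta_l, theta_deg, eps):
    """(XPDH, XPDV) in dB from eqs. (17a) and (17b); requires eps > 0."""
    C, D, E = coefficients(alpha_l, beta_l)
    two_theta = math.radians(2.0 * theta_deg)
    c2, s2 = math.cos(two_theta), math.sin(two_theta)
    cross = eps ** 2 * C * s2 ** 2
    xpdh = cross / (D + C * c2 ** 2 - E * c2)
    xpdv = cross / (D + C * c2 ** 2 + E * c2)
    return 10.0 * math.log10(xpdh), 10.0 * math.log10(xpdv)


# Hypothetical path values with the paper's |theta| = 25 deg, eps = 0.14.
xpdh_db, xpdv_db = xpd_linear_db(0.05, 0.1, 25.0, 0.14)
print(f"XPDH = {xpdh_db:.1f} dB, XPDV = {xpdv_db:.1f} dB")
```

With a positive differential attenuation and a canting angle below 45°, the denominator of (17a) is the smaller of the two, so the sketch returns a less negative (poorer) value for XPDH than for XPDV.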

The attenuation is greater in horizontal than in vertical polarization
because the effective average of the absolute value of raindrop canting
angle is estimated to be about 25°, as shown in the next section. Then
XPDH will be poorer than XPDV. Since $a_{vh} = a_{hv}$, the difference
between XPDH and XPDV for the same rain is expected to be the
same as the differential attenuation between horizontal and vertical
polarizations.

Fig. 5—Comparison between calculated and measured differential attenuation
at 18 GHz.

Now let us examine the case of circular polarization. The
polarization ratio $V/H$ is first obtained from the following transformation:

\[
\begin{pmatrix} H \\ V \end{pmatrix}
=
\begin{pmatrix} a_{hh} & a_{hv} \\ a_{vh} & a_{vv} \end{pmatrix}
\begin{pmatrix} 1 \\ j \end{pmatrix}.
\tag{18}
\]

The ratio of the desired circular polarization to the undesired
rain-induced circular polarization is”!

\[
\frac{1 - jV/H}{1 + jV/H} = \frac{T_2 + T_1}{T_2 - T_1}\, e^{-j2\theta}.
\tag{19}
\]

The cross-polarization discrimination of circular polarization becomes

\[
\mathrm{XPDC} = \left|\frac{T_2 - T_1}{T_2 + T_1}\right|^2 \left|\langle e^{j2\theta}\rangle\right|^2
= \left(\left|\frac{a_{vh}}{a_{hh}}\right|^2\right)_{\theta = 45^\circ} \left|\langle e^{j2\theta}\rangle\right|^2,
\tag{20}
\]

where the mean value of $e^{j2\theta}$ is taken over the canting angle
distribution. If all the oblate raindrops are oriented at a single canting angle,
XPDC is equal to $|a_{vh}/a_{hh}|^2$ with $\theta = 45°$. In view of the uncertainty of
the canting angle distribution, comparison between the measured and
calculated cross-polarizations will be used to estimate this reduction
factor $|\langle e^{j2\theta}\rangle|^2$. We also note that the cross-polarization discriminations
of two circular polarizations should be equal to each other.
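
A small Python sketch of eq. (20) follows. It takes the circular-polarization discrimination as $|(T_2 - T_1)/(T_2 + T_1)|^2$ multiplied by the reduction factor $|\langle e^{j2\theta}\rangle|^2$. The function names and sample coefficients are hypothetical. The gaussian canting-angle model is an assumption borrowed from the paper's footnote (standard deviation of 30 to 40°); for a gaussian distribution the reduction factor is $\exp(-4\sigma^2)$ with $\sigma$ in radians, independent of the mean.

```python
import cmath
import math


def xpdc_db(t1, t2, reduction=1.0):
    """XPDC in dB: |(T2 - T1)/(T2 + T1)|^2 times the reduction factor."""
    ratio = abs((t2 - t1) / (t2 + t1)) ** 2
    return 10.0 * math.log10(ratio * reduction)


def gaussian_reduction(std_deg):
    """|<exp(j*2*theta)>|^2 for a gaussian canting-angle model (assumed).

    The characteristic function of a gaussian gives exp(-2*sigma^2) for
    the magnitude of the mean, hence exp(-4*sigma^2) for its square.
    """
    sigma = math.radians(std_deg)
    return math.exp(-4.0 * sigma ** 2)


# Hypothetical transmission coefficients for a 5-km path.
t1 = cmath.exp(-(0.10 - 0.30j) * 5.0)
t2 = cmath.exp(-(0.12 - 0.33j) * 5.0)

for std_deg in (30.0, 40.0):
    factor = gaussian_reduction(std_deg)
    print(std_deg, -10.0 * math.log10(factor), xpdc_db(t1, t2, factor))
```

Under this assumed model the reduction comes out at a few decibels, the same order as the empirical values discussed in Section 4.3.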


IV. NUMERICAL RESULTS AND DISCUSSIONS 
4.1 XPD of vertical and horizontal polarizations 

We first estimate an effective average of the absolute value of the
canting angle by comparing the calculated differential attenuation
with the measured data. Such comparisons are shown in Figs. 5 and 6
for 18 and 30 GHz. The calculated curves assume that the canting
angles of all raindrops are oriented at 20°, 25°, and 30° from the
vertical direction. A uniform rain rate has also been imposed over each
measured path in the calculations.

Fig. 6—Comparison between calculated and measured differential attenuation
at 30 GHz. (Path length = 1.89 km. Ordinate: differential attenuation in dB;
abscissa: sum attenuation in dB, 0 to 50.)

The 18-GHz data in Fig. 5 were obtained by W. T. Barnett’? and S. H.
Lin” in Georgia. R. A. Semplak’s 30-GHz measurement in New Jersey
is shown in Fig. 6, where the dots indicate the scatter of the original
data and the crosses are median values at 5-dB intervals. The abscissa
in Fig. 5 is simply the attenuation of vertically polarized waves,
whereas the one in Fig. 6 is the sum attenuation, $\tfrac{1}{2}(|a_{hh}|^2 + |a_{vv}|^2)$,
which represents the received power sum of the horizontal and vertical
components of a transmitted wave linearly polarized at 45° from the
vertical direction. It is evident from the comparisons in Figs. 5 and 6
that 25° is a good estimate for an effective average of the absolute
value of the canting angle. Substituting $|\theta| = 25°$ in (10) and (11)
yields the calculated attenuation of vertically and horizontally
polarized waves as shown in Fig. 7. We note that an effective average of the
absolute value of the canting angle should not be mistaken for the
mean canting angle,?? which is much smaller.*

Fig. 7—Attenuation coefficients of vertically and horizontally polarized waves.
(Ordinate: attenuation coefficient in dB/km.)

Fig. 8—Comparison between calculated and measured cross-polarization
discriminations at 18 GHz. (Ordinate: XPDH in dB. Curves calculated at
18.1 GHz with oblate raindrops of 25° canting angle for a 5.07-km path of
uniform rain; + measured 17.71-GHz median of 16 rainstorms over a 5.07-km
path (Barnett); ○ measured 18.35 GHz.)

Next, an estimate is made of the imbalance between positive and
negative canting angles by comparing the calculated cross-polarization
discrimination with the measured data. The calculated curves in Figs.
8 and 9 are computed from (17) with $|\theta| = 25°$ and $\epsilon$ assumed as 0.1,
0.2, and 0.3. Uniform rain rates have been assumed over the
propagation path in the calculations. The crosses in Fig. 8 are Barnett’s
measured 17.71-GHz median XPDH of 16 rain storms in Georgia.’ The
dots from Semplak’s 18.85-GHz measurements over a 2.6-km path
are given in Fig. 8 to provide a check. The measured XPDV points in
Fig. 9 are deduced from polarization rotation measurements” at 30.9
GHz, where the effect of the differential phase shift has been assumed
to be negligible. The assumption is approximately valid at this
frequency for heavy rain. Figures 8 and 9 show that the measured data
are largely confined within the curves with $\epsilon$ = 0.1 and 0.2, and hence
suggest the geometric mean 0.14 as a good estimate for the median
value of $\epsilon$.

Having determined the parameters $|\theta| = 25°$ and $\epsilon = 0.14$, we can
use (16) to calculate crosstalk ratios vs attenuation of transmitted
signal for various frequencies at 100 mm/hr rain rate, as shown in
Fig. 10. The solid and dashed curves are expected to be median values
at 100 mm/hr rain rate for XPDH and XPDV, respectively. For a
given rain fading, the cross-polarization increases with decreasing
frequency. However, 4- and 6-GHz communication systems seldom
experience rain attenuation of more than a few decibels. Although it
takes a slightly heavier rain for the vertically polarized signal to suffer
the same fading as the horizontally polarized signal, XPDV is
generally less than XPDH for the same fading, except at lower
frequencies such as 4 GHz, where the differential phase shift dominates the
cross-polarization excitation.

* If the canting angle distribution is simulated by a gaussian model, then the
mean will be 2 to 3° and the standard deviation will be 30 to 40°.

Fig. 9—Comparison between calculated and measured cross-polarization
discriminations at 30 GHz.

In addition to explaining the measured cross-polarization at 18 and 
30 GHz, the predicted cross-polarization discriminations in Fig. 10 
also agree fairly well with measured data at 4 GHz of Barnett,’ at 11 
GHz of Watson and Arbabi?* and Evans and Thompson,” at 20 GHz 
of Yamamoto et al.,?8 and at 60 GHz of Delange et al.!® The lack of 
precise agreement stems not only from the imperfection of the theo- 
retical model but also from the measuring error of the experiments. 
Ground reflection and antenna depolarization often limit the
cross-polarization discrimination of measuring systems to around −35
dB in clear weather.

Measurements show considerable variance of cross-polarization
discrimination at a given rain attenuation. This factor must be kept in
mind when median values of cross-polarization discriminations at
given rain fades are used for the design of a dual-polarization radio
communication system. The worst cross-polarization discrimination
can be 5 to 10 dB above the median values, whereas it is also possible
to have very little cross-polarization when almost perfect cancellation
occurs among the raindrop canting-angle distribution, i.e., $\epsilon \to 0$. For
more precise predictions of radio channel reliability, the joint statistics
of rain fading and cross-polarization discrimination should be
considered.


4.2 Effect of rain rate 


Measured statistics will always be needed for the design of
communication systems. In order to extrapolate statistical results from
one measured path to another path of different length, it is essential
to know the effect of rain rate on depolarization for a given fading.
Qualitative discussions in Section II already suggest that the
depolarization is relatively insensitive to rain rate. This observation is now
confirmed in Fig. 11, where XPDH versus frequency is plotted for
20-dB fading at three rain rates of 100, 25, and 5 mm/hr.

Fig. 10—Calculated rain-induced cross-polarization of horizontally and vertically
polarized waves.


4.3 XPD of circular polarization 


Equation (20) has been used to calculate the rain-induced
cross-polarization of circularly polarized waves as shown in Fig. 12. To
obtain agreement between the calculated curves and the measured
median values of Semplak® at 18 GHz, the canting-angle reduction
factor $|\langle e^{j2\theta}\rangle|^2$ has been empirically determined as 8 dB. On the other
hand, Saunders’ measured canting-angle distribution yields a
reduction factor of 6 dB. In view of the variability of the rain storm, the
above discrepancy does not seem unreasonable. Comparison of Figs.
10 and 12 shows that the rain-induced depolarization of circularly
polarized waves is about 10 dB worse than that of a horizontally
polarized wave.*

Fig. 11—Cross-polarization at 20-dB fading for various rain rates. (XTDH (XPDH),
20-dB fading.)
Fig. 12—Calculated rain-induced cross-polarization of circularly polarized waves.


4.4 Oblique propagation 


The above numerical results have been confined to the case of
$\alpha = \pi/2$, where $\alpha$ is the angle between the direction of propagation
and the axis of symmetry of oblate raindrops. This case corresponds
to terrestrial microwave relay systems, whereas other values of $\alpha$ are
of interest in satellite systems. Limited data for the latter cases are
also available from point-matching scattering solutions.!°4 However,
the following approximate relations exist:

\[ (A_{II} - A_{I})_{\alpha} \approx (A_{II} - A_{I})_{\pi/2} \sin^2\alpha, \tag{21} \]
\[ (\Phi_{II} - \Phi_{I})_{\alpha} \approx (\Phi_{II} - \Phi_{I})_{\pi/2} \sin^2\alpha. \tag{22} \]

* The depolarization of a signal linearly polarized at 45° from vertical is expected
to be about the same as that of a circularly polarized wave (Ref. 29).

Table VIII — 30-GHz differential attenuation in dB/km

p in mm/hr    α = 90°     α = 70°              α = 50°
    0.25      0.00313     0.00276 (0.00276)*   0.00184 (0.00184)
    1.25      0.0261      0.0231 (0.0230)      0.0154 (0.0153)
    2.5       0.0637      0.0563 (0.0562)      0.0376 (0.0374)
    5.0       0.150       0.133 (0.133)        0.0890 (0.0886)
   12.5       0.455       0.403 (0.402)        0.270 (0.267)
   25.0       1.00        0.886 (0.883)        0.593 (0.587)
   50.0       2.14        1.89 (1.89)          1.27 (1.26)
  100.0       4.29        3.80 (3.79)          2.54 (2.52)
  150.0       6.32        5.60 (5.58)          3.74 (3.71)

* Numbers in parentheses are $\Delta A_{\alpha = 90°} \sin^2\alpha$.


The above approximation was also suggested by Evans and
Troughton.** A simple derivation in the appendix illustrates the underlying
assumption. This low-frequency approximation has been tested by
comparison with 30-GHz point-matching results of $\alpha$ = 70° and
$\alpha$ = 50°.° Tables VIII and IX show excellent agreement except for the
differential phase shift of heavy rain rates at $\alpha$ = 50°, where
cancellation between large numbers is involved.
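
The comparison in Table VIII can be reproduced with a few lines of Python. In the sketch below the table values are transcribed from Table VIII (differential attenuation only); the function names are hypothetical, and the code simply scales the $\alpha$ = 90° column by $\sin^2\alpha$ as in eq. (21) and reports the largest relative deviation from the point-matching column.

```python
import math

# 30-GHz differential attenuation in dB/km from Table VIII:
# rain rate (mm/hr) -> (alpha = 90 deg, point-matching 70 deg, point-matching 50 deg)
TABLE_VIII = {
    0.25: (0.00313, 0.00276, 0.00184),
    1.25: (0.0261, 0.0231, 0.0154),
    2.5: (0.0637, 0.0563, 0.0376),
    5.0: (0.150, 0.133, 0.0890),
    12.5: (0.455, 0.403, 0.270),
    25.0: (1.00, 0.886, 0.593),
    50.0: (2.14, 1.89, 1.27),
    100.0: (4.29, 3.80, 2.54),
    150.0: (6.32, 5.60, 3.74),
}


def scaled(value_at_90, alpha_deg):
    """Eq. (21): value at alpha = 90 deg times sin^2(alpha)."""
    return value_at_90 * math.sin(math.radians(alpha_deg)) ** 2


def worst_relative_error(alpha_deg, column):
    """Largest |approximation - point matching| / point matching."""
    return max(abs(scaled(row[0], alpha_deg) - row[column]) / row[column]
               for row in TABLE_VIII.values())


print(worst_relative_error(70.0, 1))
print(worst_relative_error(50.0, 2))
```

Both deviations come out small compared with the spread of measured rain data, which is the sense in which the tables show good agreement for differential attenuation.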

Making use of eqs. (21) and (22), we can predict the
cross-polarization discriminations of satellite systems by simply dividing the abscissa
scale by $\sin^2\alpha$ in Fig. 10. Here, the incident linear polarization from
the satellite has been assumed to be parallel or orthogonal to the plane
containing the direction of propagation and the local gravity direction
of the ground station. When this assumption is violated in the fringe
area of an area coverage satellite antenna, higher cross-polarization is
expected.


Table IX — 30-GHz differential phase shift in deg/km

p in mm/hr    α = 90°     α = 70°             α = 50°
    0.25      0.0585      0.0521 (0.0517)*    0.0346 (0.0343)
    1.25      0.294       0.263 (0.260)       0.175 (0.173)
    2.5       0.569       0.510 (0.502)       0.339 (0.334)
    5.0       1.06        0.952 (0.936)       0.632 (0.622)
   12.5       2.30        2.06 (2.03)         1.36 (1.35)
   25.0       3.77        3.36 (3.33)         2.21 (2.21)
   50.0       5.56        4.93 (4.91)         3.20 (3.26)
  100.0       6.59        5.77 (5.82)         3.59 (3.87)
  150.0       6.43        5.53 (5.68)         3.23 (3.77)

* Numbers in parentheses are $\Delta\Phi_{\alpha = 90°} \sin^2\alpha$.

APPENDIX
Approximation for Oblique Propagation


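If the axis of symmetry of each oblate spheroidal raindrop is aligned
in the same direction and the raindrop radius is small compared with
wavelength, then a uniform rain can be approximately characterized
as an anisotropic medium with the relative dielectric constant given
in the Cartesian coordinates (XYZ):

\[
\bar{\epsilon} =
\begin{pmatrix} \epsilon_1 & 0 & 0 \\ 0 & \epsilon_2 & 0 \\ 0 & 0 & \epsilon_2 \end{pmatrix}
=
\begin{pmatrix} n_1^2 & 0 & 0 \\ 0 & n_2^2 & 0 \\ 0 & 0 & n_2^2 \end{pmatrix}.
\tag{23}
\]

The governing equation for a plane wave $e^{\boldsymbol{\gamma}\cdot\mathbf{r}}$ is simply

\[ \boldsymbol{\gamma} \times (\boldsymbol{\gamma} \times \mathbf{E}) = k^2\, \bar{\epsilon} \cdot \mathbf{E}. \tag{24} \]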
In view of the symmetrical property of the medium, the arbitrary
direction of propagation, $\boldsymbol{\gamma}$, may be confined to the XZ-plane without loss
of generality. Let the angle between the direction of propagation and
the axis of symmetry (x-axis) be denoted by $\alpha$. Now we consider two
polarizations with subscripts I and II designating the electric fields
parallel and perpendicular to the plane containing the axis of
symmetry and the direction of propagation. Polarization II has only one
electric field component, $E_y$. Substituting $\mathbf{E} = E_y\hat{y}$ into (24)
immediately yields

\[ \gamma_{II}^2 = -k^2 n_2^2. \tag{25} \]

Polarization I has two electric field components, $E_x$ and $E_z$.
Substituting them into eq. (24) gives two equations:

\[ (k^2 n_1^2 + \gamma_{Iz}^2) E_x - \gamma_{Ix}\gamma_{Iz} E_z = 0, \tag{26} \]
\[ -\gamma_{Ix}\gamma_{Iz} E_x + (k^2 n_2^2 + \gamma_{Ix}^2) E_z = 0. \tag{27} \]

For the above two equations to be compatible, we need the following
condition:

\[
\frac{k^2 n_1^2 + \gamma_{Iz}^2}{\gamma_{Ix}\gamma_{Iz}}
= \frac{\gamma_{Ix}\gamma_{Iz}}{k^2 n_2^2 + \gamma_{Ix}^2}.
\tag{28}
\]

Substituting $\gamma_{Ix} = \gamma_I \cos\alpha$ and $\gamma_{Iz} = \gamma_I \sin\alpha$ into the above equation
yields

\[
\gamma_I^2 = \frac{-k^2 n_1^2 n_2^2}{n_1^2 \cos^2\alpha + n_2^2 \sin^2\alpha}.
\tag{29}
\]

We note that

\[ \gamma_I^2 = -k^2 n_1^2 \quad \text{when } \alpha = \frac{\pi}{2}. \tag{30} \]

Subtracting (29) from (25),

\[
\gamma_{II}^2 - \gamma_I^2 = \frac{-k^2 n_2^2 (n_2^2 - n_1^2) \sin^2\alpha}{n_1^2 \cos^2\alpha + n_2^2 \sin^2\alpha}.
\tag{31}
\]

Since $n_1 \approx n_2 \approx 1$ and $\gamma_I \approx \gamma_{II} \approx jk$,

\[ \gamma_{II} - \gamma_I \approx jk (n_2 - n_1) \sin^2\alpha. \tag{32} \]
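
The quality of the approximation (32) can be gauged numerically. The Python sketch below evaluates the propagation constants from (25) and (29), choosing the roots near $jk$, and compares their exact difference with $jk(n_2 - n_1)\sin^2\alpha$. The refractive indices and the wavenumber are hypothetical values picked only to satisfy $n_1 \approx n_2 \approx 1$; they are not taken from the paper.

```python
import cmath
import math


def gamma_ii(k, n2):
    """Root of eq. (25) near jk."""
    return 1j * k * n2


def gamma_i(k, n1, n2, alpha):
    """Root of eq. (29) near jk."""
    denom = n1 ** 2 * math.cos(alpha) ** 2 + n2 ** 2 * math.sin(alpha) ** 2
    return 1j * k * n1 * n2 / cmath.sqrt(denom)


def exact_difference(k, n1, n2, alpha):
    return gamma_ii(k, n2) - gamma_i(k, n1, n2, alpha)


def approx_difference(k, n1, n2, alpha):
    """Eq. (32)."""
    return 1j * k * (n2 - n1) * math.sin(alpha) ** 2


# Hypothetical effective indices of the rain medium and k at 30 GHz (rad/m).
K = 2.0 * math.pi / 0.01
N1 = 1.0 + (2.0 + 1.0j) * 1e-6
N2 = 1.0 + (2.3 + 1.2j) * 1e-6

for alpha_deg in (50.0, 70.0, 90.0):
    a = math.radians(alpha_deg)
    exact = exact_difference(K, N1, N2, a)
    approx = approx_difference(K, N1, N2, a)
    print(alpha_deg, abs(exact - approx) / abs(exact))
```

For indices this close to unity the relative error printed by the loop is very small, which reflects the assumption $n_1 \approx n_2 \approx 1$ behind (32).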
Design Considerations for a Two-Phase,
Buried-Channel, Charge-Coupled Device


By J. MCKENNA, N. L. SCHRYER, and R. H. WALDEN 
(Manuscript received March 29, 1974) 


The design of a two-phase, buried-channel (or bulk-channel)
charge-coupled device is presented. Directionality is obtained by using a
stepped-oxide structure. The basic operation of the device is explained, and the
effect that changes in various design parameters have on its operation is
examined in some detail. A set of roughly optimal parameters is found
that yield an extremely fast and efficient device. We estimate a charge- 
transfer time of 1.8 ns and a charge capacity of 4.1 X 10" (electrons/cm?). 
Only existing technology is necessary for its fabrication. 


This paper presents some design considerations for a two-phase, 
buried-channel (or bulk-channel) charge-coupled device (BCCD). The 
concept of the BCCD has been presented previously,!? and operation 
of three-phase BCCD’s has been demonstrated.*-* Two-phase surface 
charge-coupled devices (CCD’s) have advantages over three-phase 
surface CCD’s in many applications, and several designs have been 
discussed.’—!2 Therefore, it seems important and timely to consider the 
design of two-phase BCCD’s. 

We present here a brief review of the basic n-channel BCCD
structure. Figure 1 shows the CCD electrode configuration originally
proposed for the buried-channel device.¹ Beneath the charge-transfer
electrodes are successively a layer of silicon dioxide about 1200 Å thick,
a layer of n-type single-crystal silicon, and finally the substrate of
lightly doped p-type silicon. By depleting the entire n-region and
part of the adjacent p-substrate of mobile carriers with the aid of a
reverse-biased diode at the end of the channel, a potential configuration
is obtained like the one shown schematically in Fig. 2.¹ Here we plot
the negative of the electrostatic potential, i.e., the potential energy of
an electron. The distinguishing feature is that the potential energy
minimum is located away from the semiconductor-insulator interface;
this means that any mobile charge being transferred down the channel
travels in bulk silicon, and the transfer should be free of losses
associated with interface states. It also means that the free carrier
mobility may have a value close to that for bulk. Both these factors
were expected to increase transfer efficiency relative to surface CCD’s,
but at the expense of a reduced charge-carrying capability resulting
from the reduced capacitance associated with the increased separation
between the metal gates and the channel.

Fig. 1—Schematic diagram of a three-phase buried-channel CCD.

However, there are necessarily gaps between adjacent electrodes in
this three-phase device. Since the image charge in the metal plays an
important role in controlling the channel potential, the finite gaps give
rise to local potential wells, which store charge between the
electrodes.¹³ The amount of charge in each well is not constant; it depends
on the values of the clock voltages on neighboring gate electrodes. Thus,
charge can be exchanged between the signal and the well. This can
lead to extremely inefficient transfer.¹ It has been shown that this
problem can be eliminated by ensuring that the potential between
electrodes varies monotonically as a function of distance between
plates.*-13.14 The gap problem can also be alleviated by using a
fabrication procedure that reduces the interelectrode gap to zero.!® If we
have a zero-gap two-phase device, there will be operational and
fabricational simplifications relative to three-phase devices.

The stepped-oxide structure illustrated in Fig. 3 not only can be 
operated as a two-phase BCCD, but it also has essentially zero gaps 
between the electrodes." This is the basic configuration studied in 
this paper. Other studies of this configuration have been made.!*—!8 
Techniques now exist for fabricating the device. The n-type layer, 
which has a uniform surface concentration, can be obtained by doing 
an ion implant in the required channel region before the oxide steps 
are defined. The definition of metallization and oxide steps can be 
accomplished by using either the undercut isolation scheme’ or an 
overlapping-gate technology." 

Our purposes here are to investigate the principles of operation of 
the device, to study the effect of varying certain of its design param- 
eters, and to attempt to make a reasonably optimal choice of these 
parameters. 

In Fig. 3, we see that the width of the electrode over the thick-oxide
step is $w_1$ and over the thin-oxide step is $w_2$. The thickness of the
thick step is $d_1$ and of the thin step $d_2$; the permittivity of the oxide
is $\epsilon_{ox}$. The n-type layer has thickness $Y_1$ and permittivity $\epsilon_s$, and the
donor density $N_D$ is assumed to vary with position as”

\[
N_D(y) = c_s \exp\left[-\left(\frac{y - d_2}{Y_1}\right)^2 \ln\left(\frac{c_s}{N_A}\right)\right] - N_A,
\tag{1}
\]

where $y$ is distance measured from the top of the thin-oxide step, $c_s$
is the number density of donor ions at the upper surface of the n-type
layer, and $N_A$ is the acceptor number density of the p-type substrate.
Finally, the uniformly doped p-type substrate has thickness $Y_2$ and
permittivity $\epsilon_s$. Of these, the design parameters are $d_1$, $Y_1$, and the
total implanted charge in the n-layer.

Fig. 2—Schematic potential diagram of a buried-channel CCD.
Fig. 3—Schematic diagram of a two-phase, stepped-oxide BCCD.

The operation of the device can be qualitatively explained on the
basis of a simplified one-dimensional model with constant n-layer
doping $N_D$, which is discussed in the appendix. As shown there, the
depth of the potential energy well, shown schematically in Fig. 2,
increases with increasing oxide thickness. This means that the region
under the thin oxide in Fig. 3 will act as a barrier to charge flow while
that under the thick oxide will store charge. Interestingly, this is just
the opposite of the case for a surface CCD, and consequently the
direction of transfer in a two-phase BCCD is opposite to that of a
surface device. The device acts as a BCCD provided the electrode
voltage does not exceed a limiting value $V_{lim}$, where

\[
V_{lim} = \frac{e N_D Y_1^2}{2\epsilon_s}\left(1 + \frac{N_D}{N_A}\right).
\tag{2}
\]

If the plate voltage exceeds $V_{lim}$, then the potential minimum is
located exactly at the insulator-semiconductor interface. Typically,
$V_{lim}$ has a value of several hundred volts.
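
As a sanity check on the magnitude of the limiting voltage, the Python sketch below evaluates it under stated assumptions: eq. (2) is read as $V_{lim} = (e N_D Y_1^2 / 2\epsilon_s)(1 + N_D/N_A)$, the constant n-layer doping is taken as $N_D = Q_n / Y_1$, and the silicon permittivity is taken as 12 times that of free space. The function and constant names are hypothetical. With these assumptions the results land close to the plate-voltage-limit column of Table I.

```python
E_CHARGE = 1.602e-19      # electron charge, C
EPS_0 = 8.854e-14         # permittivity of free space, F/cm
EPS_S = 12.0 * EPS_0      # assumed permittivity of silicon, F/cm


def v_lim(q_n, y1_um, n_a=5e14):
    """Limiting plate voltage in volts.

    q_n   : total n-layer doping charge, cm^-2
    y1_um : n-layer thickness, micrometers
    n_a   : substrate acceptor density, cm^-3
    Assumes constant doping N_D = q_n / Y1 in the n-layer.
    """
    y1 = y1_um * 1e-4                  # micrometers to cm
    n_d = q_n / y1                     # cm^-3
    return E_CHARGE * n_d * y1 ** 2 / (2.0 * EPS_S) * (1.0 + n_d / n_a)


# (Q_n in cm^-2, Y1 in micrometers, V_lim listed in Table I)
CASES = [
    (1.5e12, 0.4, 343.0),
    (1.5e12, 1.2, 352.0),
    (0.5e12, 0.4, 39.0),
    (4.5e12, 1.2, 3088.0),
]

for q_n, y1_um, listed in CASES:
    print(q_n, y1_um, round(v_lim(q_n, y1_um), 1), listed)
```

The dominant term scales as $Q_n^2 / N_A$, which is why tripling the implanted charge raises the limit by roughly a factor of nine.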

We wish to choose the design parameter values so that the potential 
well under the thick oxide is deep enough to store as much signal charge 
as possible, and yet the potential barrier between two wells can be over- 
come by applying reasonable potentials to the plates to obtain com- 
plete transfer of this charge. 

To obtain more quantitative information about the device, we turn
to a two-dimensional calculation. We use a model described in an
earlier paper¹⁸ to calculate the electrostatic potential $\psi(x, y)$ in the
absence of any mobile charge. For the purpose of the two-dimensional
potential calculation, the plate widths $w_1$ and $w_2$ are kept fixed at
10 µm throughout the discussion, as is the thin-oxide thickness at
$d_2$ = 0.12 µm. The uniform doping of the p-type substrate is also held
fixed at $N_A = 5 \times 10^{14}$ cm$^{-3}$, and it is assumed that $Y_2$ = 50 µm,
which is at least twice the maximum-depletion-region width at any
voltage considered. These values are similar to those commonly used
in most MOS technologies. We somewhat arbitrarily put an upper
limit of 15 volts on the potential difference between electrodes, which
we are willing to use to transfer charge from one potential well to a
neighboring one.

Fig. 4—Single cell of the model used to calculate curves of Figs. 5 and 6 and
numbers of Table I. (Plate voltage V = 0 (15 V).)

Fig. 5—Channel potential in a two-phase BCCD with all plates either at 0 or
15 volts. Parameters are $d_1$ = 0.3 µm, $d_2$ = 0.12 µm, $Y_1$ = 1.2 µm,
$Q_n = 1.5 \times 10^{12}$ cm$^{-2}$, $N_A = 5 \times 10^{14}$ cm$^{-3}$.

First, we consider the case in which the electrodes are all at the
same potential, either 0 or 15 volts, and we approximately model the
device by the configuration in Fig. 4. The potentials and fields were
calculated for combinations of the following parameter values:
thick-oxide thicknesses ($d_1$) of 0.3 and 0.6 µm; n-layer thicknesses ($Y_1$) of
0.4 and 1.2 µm, and total n-layer doping charges ($Q_n$) of $0.5 \times 10^{12}$
cm$^{-2}$, $1.5 \times 10^{12}$ cm$^{-2}$, and $4.5 \times 10^{12}$ cm$^{-2}$. The spatial distribution of
doping in the n-layer is assumed to be given by (1). It can be shown”
that $c_s$, $Y_1$, and $Q_n$ are related by

\[
Q_n = Y_1 \left\{ \frac{c_s}{2} \sqrt{\frac{\pi}{\ln(c_s/N_A)}}\;
\operatorname{erf}\left[\sqrt{\ln(c_s/N_A)}\right] - N_A \right\},
\tag{3}
\]

where erf (z) is the error function.” Equation (3) was used to calculate
$c_s$, given the other parameter values.
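
Solving for $c_s$ requires a numerical root search, since $c_s$ appears inside both the logarithm and the error function. The Python sketch below is one way this might be done, under the assumption that the profile (1) is the gaussian $N_D(y) = c_s \exp[-((y - d_2)/Y_1)^2 \ln(c_s/N_A)] - N_A$ and that (3) is its integral over the n-layer, $Q_n = Y_1\{(c_s/2)\sqrt{\pi/\ln(c_s/N_A)}\,\mathrm{erf}[\sqrt{\ln(c_s/N_A)}] - N_A\}$. The bisection routine and all function names are illustrative choices, not the authors' procedure.

```python
import math


def profile(u, c_s, y1, n_a):
    """Assumed doping profile (1); u = y - d2 is depth into the n-layer, cm."""
    return c_s * math.exp(-((u / y1) ** 2) * math.log(c_s / n_a)) - n_a


def q_n_from_cs(c_s, y1, n_a):
    """Assumed form of eq. (3): total n-layer doping charge, cm^-2."""
    ln_ratio = math.log(c_s / n_a)
    return y1 * (0.5 * c_s * math.sqrt(math.pi / ln_ratio)
                 * math.erf(math.sqrt(ln_ratio)) - n_a)


def integrate_profile(c_s, y1, n_a, steps=2000):
    """Midpoint-rule integral of the profile over 0 <= u <= Y1."""
    h = y1 / steps
    return h * sum(profile((i + 0.5) * h, c_s, y1, n_a) for i in range(steps))


def solve_cs(q_n, y1, n_a):
    """Bisection (on a logarithmic scale) for c_s > N_A matching q_n."""
    lo = n_a * (1.0 + 1e-9)
    hi = n_a * 10.0
    while q_n_from_cs(hi, y1, n_a) < q_n:
        hi *= 10.0
    for _ in range(200):
        mid = math.sqrt(lo * hi)
        if q_n_from_cs(mid, y1, n_a) < q_n:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


# Parameters of Fig. 5: Q_n = 1.5e12 cm^-2, Y1 = 1.2 um, N_A = 5e14 cm^-3.
c_s = solve_cs(1.5e12, 1.2e-4, 5e14)
print(f"c_s = {c_s:.3e} cm^-3")
```

The bisection relies on the right-hand side of (3) increasing monotonically with $c_s$, which holds for the assumed gaussian form.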

Figures 5 and 6 give plots of the channel potential $\phi_c(x)$ as a function
of distance parallel to the oxide-semiconductor interface for two sets of
parameter values shown in the figure captions. If $\psi(x, y)$ is the
electrostatic potential in the device, then

\[
\phi_c(x) = -\max_{d_2 \le y \le d_2 + Y_1} \psi(x, y).
\tag{4}
\]


We summarize our calculations in Table I. Of particular interest is
the variation of the channel potential with the thick-oxide thickness.
The variation is defined as the voltage difference between the minimum
and the maximum, as shown in Figs. 5 and 6. For $Y_1$ = 1.2 µm,
$Q_n = 1.5 \times 10^{12}$ cm$^{-2}$, and a plate voltage of 15 volts, the values of
the variation are 6.65, 11.75, and 15.32 volts, corresponding
respectively to $d_1$ values of 0.3, 0.45, and 0.6 µm.

Fig. 6—Channel potential in a two-phase BCCD with all plates either at 0 or 15
volts. The parameters are $d_1$ = 0.6 µm, $d_2$ = 0.12 µm, $Y_1$ = 1.2 µm,
$Q_n = 1.5 \times 10^{12}$ cm$^{-2}$, $N_A = 5 \times 10^{14}$ cm$^{-3}$.

Table I

 d1      Y1      10^-12 Qn   Plate     Minimum     Maximum     Minimum    Plate
 (µm)    (µm)    (cm^-2)     voltage   channel     channel     channel    voltage
                             (V)       potential   potential   depth      limit
                                       (V)         (V)         (µm)       Vlim
 0.6*    0.4     1.5          0        −25.28      −7.56       0.106      343
 0.6*    0.4     1.5         15        −38.23      −22.02      0.099      343
 0.3     0.4     1.5          0        −15.62      −7.93       0.140      343
 0.3     0.4     1.5         15        −28.90      −21.87      0.124      343
 0.6*    1.2     1.5          0        −27.27      −10.78      0.381      352
 0.6*    1.2     1.5         15        −39.83      −24.51      0.3840     352
 0.3     1.2     1.5          0        −17.89      −10.61      0.398      352
 0.3     1.2     1.5         15        −30.88      −24.23      0.398      352
 0.6*    0.4     0.5          0        −6.78       −2.35       0.096      39
 0.6*    0.4     0.5         15        −18.58      −16.01      0.050      39
 0.3     0.4     0.5          0        −4.42       −2.34       0.125      39
 0.3     0.4     0.5         15        −17.12      −15.99      0.058      39
 0.6*    1.2     4.5*         0        −91.3       −35.63      0.398      3088
 0.6*    1.2     4.5*        15        −104.74     −49.90      0.390      3088
 0.3     1.2     4.5*         0        −58.57      −34.45      0.398      3088
 0.3     1.2     4.5*        15        −72.25      −48.58      0.398      3088

* Unacceptable values.

Note that the physical depth of the channel (the distance of the 
potential minimum below the oxide interface) is less by as much as a 
factor of 2 when the doping profile is given by (1) than when it is 
constant, equal to the average doping, which has been pointed out 
elsewhere.” 

Although actual operation of the device involves having different 
voltages on successive electrodes, a first screening of the possible 
parameter values can be made on the basis of the calculations de- 
scribed in the preceding paragraphs. A criterion for total charge 
transfer from under the plate at 0 volt to the plate at 15 volts is that 
the minimum channel potential under the 0-volt plate be greater than 
the maximum channel potential under the plate at 15 volts, i.e., the 
barrier potential in the receiving region should be less than that of the 
potential well in the sending region. Table I shows that all cases of
$d_1 = 0.6$ μm or $Q_n = 4.5 \times 10^{12}$ cm$^{-2}$ violate this condition. The cases
of $Q_n = 4.5 \times 10^{12}$ cm$^{-2}$ and $Y_1 = 0.4$ μm are not shown because they
would also violate this condition. These parameter choices were
rejected. The parameter value $Q_n = 0.5 \times 10^{12}$ cm$^{-2}$ was also rejected
because the minimum channel depth is small.
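
The screening criterion can be applied mechanically to Table I. The sketch below, which uses a handful of rows copied from the table, tests whether the minimum channel potential under the 0-volt plate exceeds the maximum channel potential under the 15-volt plate. The data layout and the function name `transfers_completely` are assumptions made for illustration.

```python
# Selected Table I rows keyed by (d1 in um, Y1 in um, Qn in 1e12 cm^-2).
# Each entry maps plate voltage -> (minimum, maximum) channel potential in V.
TABLE_I = {
    (0.6, 0.4, 1.5): {0: (-25.28, -7.56), 15: (-38.23, -22.02)},
    (0.3, 0.4, 1.5): {0: (-15.62, -7.93), 15: (-28.90, -21.87)},
    (0.3, 1.2, 1.5): {0: (-17.89, -10.61), 15: (-30.88, -24.23)},
    (0.3, 1.2, 4.5): {0: (-58.57, -34.45), 15: (-72.25, -48.58)},
}


def transfers_completely(row):
    """Criterion from the text: the minimum channel potential under the
    0-V plate must be greater than the maximum under the 15-V plate."""
    min_sending = row[0][0]
    max_receiving = row[15][1]
    return min_sending > max_receiving


for params, row in TABLE_I.items():
    verdict = "passes" if transfers_completely(row) else "violates criterion"
    print(params, verdict)
```

With these rows, the two $d_1 = 0.3$ μm, $Q_n = 1.5 \times 10^{12}$ cm$^{-2}$ cases pass, while the $d_1 = 0.6$ μm and $Q_n = 4.5 \times 10^{12}$ cm$^{-2}$ examples fail, in line with the rejections described above.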

There remain the parameter values $d_1 = 0.3$ μm and $Q_n = 1.5 \times 10^{12}$
cm$^{-2}$, and either $Y_1 = 0.4$ μm or $Y_1 = 1.2$ μm. Since there seems to be
little difference between these two cases on the basis of the calculations
so far, we also consider the case in which $d_1$ and $Q_n$ are as stated above
and $Y_1 = 0.8$ μm. We now examine the device in which one plate is
at 0 volt and the adjacent one at 15 volts. The device was modeled
by the configuration of Fig. 7, and the calculations are again based on
the model of Ref. 18, in which there is no mobile charge. In all the
calculations to be discussed now, we took $d_1 = 0.3$ μm, $d_2 = 0.12$ μm,
$Y_1 = 0.4$, 0.8, or 1.2 μm, $Q_n = 1.5 \times 10^{12}$ cm$^{-2}$, $N_A = 5 \times 10^{14}$ cm$^{-3}$,
and $N_D(y)$ given by (1). Figure 8 plots some results for one cell of
such a BCCD for the case $Y_1 = 0.8$ μm; $\phi_c$ is the channel potential,
$\phi_s$ is the potential at the oxide-semiconductor interface, and the
dotted curve is the position of the potential minimum below the
oxide-semiconductor interface.

Fig. 7. Two cells of the model used to calculate the curve of Fig. 8. Assumed
potential variation along the second level of metallization is shown below.

Fig. 8. Channel potential $\phi_c$ and potential at the semiconductor-oxide interface
$\phi_s$ in two cells of a two-phase BCCD. There is no inserted charge, plate potentials
are as shown, and parameters are $d_1 = 0.3$ μm, $d_2 = 0.12$ μm, $Y_1 = 0.8$ μm,
$Q_n = 1.5 \times 10^{12}$ cm$^{-2}$, $N_A = 5 \times 10^{14}$ cm$^{-3}$. The dashed curve shows the position
of the channel below the oxide-semiconductor interface. $\phi_c$ and $\phi_s$ at the potential
minimum under the receiving plate are also indicated when the device is full of charge.

The amount of charge that can be carried in this BCCD was esti- 
mated using a one-dimensional analysis in the well. Charge was 
added to the one-dimensional well until the minimum potential in the 
well just equalled the barrier potential; that is, the potential under the
thin-oxide step part of the 15-volt plate of Fig. 8. The values obtained
(3 to $5 \times 10^{11}$ cm$^{-2}$) indicate that practical quantities of charge can
be handled by the BCCD. The method of this calculation” is similar 
to one carried out by Kent.*4 

It is of interest to consider the relative values of surface potential 
and channel potential for empty and full wells. Figure 8 shows the 
results of the two-dimensional calculation for both potentials with no 
free charge; a potential difference of approximately 1.75 volts is 
maintained along the channel in the receiving well. As the well is 
filled with charge, this difference is reduced to 0.825 volt, as is indi- 
cated in the diagram. These last data were obtained with the aid of 
the one-dimensional calculation described above.” The 0.825-volt
differential ensures that the carrier concentration at the
silicon-silicon-dioxide interface will be a negligible fraction of that in the
channel, which, in turn, indicates that device performance will be
essentially unhindered by surface effects.

Table II contains a list of charge-carrying capacities and fringing
field values as a function of $Y_1$. Notice that the capacity falls off
relatively slowly with increasing $Y_1$, while the fringing fields increase
at a somewhat more rapid rate. Two columns give field strengths; the
left-hand column refers to the minimum horizontal field in the channel
under the "sending" well, and the right-hand column refers to that
under the "receiving" barrier. Notice that charge transport will be
mainly limited by the fields under the latter. The situation would
reverse if the maximum clock voltage were increased somewhat
beyond the 15 volts used here. It is shown below, however, that a field
strength of 710 V/cm is sufficient to ensure extremely rapid charge
transfer. The data in Table II show that the ultimate choice of $Y_1$
is one involving a tradeoff between capacity and fringing field and
would depend on the particular device requirements.

Both from the simple model in the appendix and from our
two-dimensional calculations, we estimate that, for our choice of parameter
values, the electric field at the semiconductor surface never exceeds
$1.8 \times 10^5$ V/cm and at the p-n junction never exceeds $10^5$ V/cm.
These fields are below the avalanche breakdown fields for these
conditions (3 to $4 \times 10^5$ V/cm). It can be shown that the field at the
semiconductor surface increases with increasing $Q_n$, so if $Q_n$ is too
large, this field will exceed the avalanche breakdown field. In fact, our
calculations show that in the case $Q_n = 4.5 \times 10^{12}$ cm$^{-2}$, $Y_1 = 1.2$ μm,
which we rejected for other reasons, the surface field is about
$5.8 \times 10^5$ V/cm, which indeed exceeds the avalanche breakdown field
for that case.

Table II

$Y_1$      Charge capacity        Fringe field    Fringe field
in μm      in cm$^{-2}$           under well      under barrier
                                  in V/cm         in V/cm
0.4        $4.8 \times 10^{11}$   1395            482
0.8        $4.1 \times 10^{11}$   1755            710
1.2        $3.4 \times 10^{11}$   1955            845

Fig. 9. Plot of $\log_{10} r(t)$ as a function of $t$ for the BCCD of Fig. 8.

Finally, we estimated the speed with which the device of Fig. 8 can
transfer charge from one well to the neighboring well. A technique of
Strain and Schryer?® has been adapted to cases such as the present
one. Initially, we assumed the plate voltages were the opposite of
those shown in Fig. 7, and the charge was all stored in the left-hand
well (5 μm ≤ $x$ ≤ 15 μm). Then at $t = 0$, the voltages were instantaneously
reversed to the configuration shown in Fig. 7, and the charge flowed
from the left-hand well to the right-hand well (25 μm ≤ $x$ ≤ 35 μm). The
calculation®® based on a one-dimensional analysis of the charge flow
in the channel showed that if the well initially contained 10° electrons, 
then essentially all the charge transfers in 1.8 ns (see Fig. 9). Let $Q_l(t)$
denote the total charge in the left-hand well at time $t$, and define the
transfer ratio $r(t)$ by

$$r(t) = Q_l(t)/Q_l(0). \tag{5}$$

Figure 9 plots $\log_{10} r(t)$ as a function of $t$. Figure 10 plots the charge
density in the channel (in dimensionless units) as a function of position
for $t = 0$, 0.18 ns, and 2.56 ns. By referring to Fig. 8, it is seen that the
two deep depressions in the curve for $t = 0.18$ ns are due to the very
strong field-aided transfer at those points. Note in Table II that the
minimum field under the receiving barrier is 710 V/cm, while the
minimum field under the sending well is 1755 V/cm. This accounts for
the bunching effect at $t = 0.18$ ns shown in Fig. 10. This bunching
effect can be reduced by increasing the most positive electrode
potential.

The preceding paragraphs have shown the design parameters required
to fabricate an extremely fast and efficient two-phase, buried-channel,
charge-coupled device by taking advantage of the capabilities of either
self-aligned gate technology or undercut isolation schemes and of ion
implantation technologies. This device should have the advantages of
convenient operation, because it is two-phase, and high transfer
efficiency, because the buried channel eliminates surface trapping and
surface scattering of the transferring carriers and introduces strong
fringing fields. Further, by careful design, the charge capacity of this
device, while lower, can be competitive with that of surface devices.

Fig. 10. Charge distribution (in dimensionless units) along the channel for three
different times.


APPENDIX 


This appendix briefly derives some results using a simplified,
one-dimensional model of a BCCD and the well-known depletion-layer
approximation.”° The oxide layer has thickness $d$ and permittivity $\epsilon_{ox}$.
The n-type layer has a thickness $Y_1$, permittivity $\epsilon_s$, and is uniformly
doped with donor density $N_D$. The p-type substrate is assumed to be
infinitely thick, with permittivity $\epsilon_s$ and acceptor density $N_A$. The
electrostatic potential is denoted by $\phi(y)$.

We introduce dimensionless quantities as follows. All distances are
measured in terms of Debye lengths $\lambda_D$,

$$\lambda_D = (\epsilon_s kT/e^2 N_A)^{1/2}, \tag{6}$$

where $k$ is Boltzmann's constant, $T$ is the absolute temperature, and $e$
is the magnitude of the electronic charge. Then we define

$$z = y/\lambda_D, \quad h = d/\lambda_D, \quad z_1 = Y_1/\lambda_D. \tag{7}$$

In addition, we define the dimensionless electrostatic and electrode
potentials

$$\psi(z) = e\phi(y)/kT, \quad V_0 = eV_E/kT, \tag{8}$$

and the dimensionless ratios

$$\eta = \epsilon_{ox}/\epsilon_s, \quad \sigma = N_D/N_A. \tag{9}$$

Then $\psi(z)$ is the solution of the equations

$$\psi''(z) = 0, \quad 0 \le z \le h, \tag{10a}$$
$$\psi''(z) = -\sigma, \quad h \le z \le h + z_1, \tag{10b}$$
$$\psi''(z) = 1, \quad h + z_1 \le z \le h + z_1 + R, \tag{10c}$$
$$\psi(z) = 0, \quad h + z_1 + R < z, \tag{10d}$$

which satisfies the electrostatic boundary conditions

$$\psi(0) = V_0, \tag{11a}$$
$$\psi(h-) = \psi(h+), \quad \eta\psi'(h-) = \psi'(h+), \tag{11b}$$
$$\psi(h + z_1-) = \psi(h + z_1+), \quad \psi'(h + z_1-) = \psi'(h + z_1+), \tag{11c}$$
$$\psi(h + z_1 + R) = \psi'(h + z_1 + R) = 0. \tag{11d}$$

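The thickness of the depletion layer, $R$, is an unknown to be
determined from the boundary conditions. The solution can be determined
easily:

$$\begin{aligned}
\psi(z) &= V_0 + (\sigma z_1 - R)\frac{z}{\eta}, & 0 \le z \le h, \\
&= -\tfrac{1}{2}(1 + \sigma)(z - h - z_1)^2 + \tfrac{1}{2}(z - h - z_1 - R)^2, & h \le z \le h + z_1, \\
&= \tfrac{1}{2}(z - h - z_1 - R)^2, & h + z_1 \le z \le h + z_1 + R,
\end{aligned} \tag{12}$$

where

$$R = -\left(\frac{h}{\eta} + z_1\right) + \left[(1 + \sigma)\left(\frac{h}{\eta} + z_1\right)^2 - \sigma\left(\frac{h}{\eta}\right)^2 + 2V_0\right]^{1/2}. \tag{13}$$
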

The electrostatic field is obtained from (12) by differentiation.
To determine the position, $z_m$, of the electrostatic potential
maximum, we first set $\psi'(z_m) = 0$ in $h < z < h + z_1$, and obtain

$$z_m = h + \frac{1}{\sigma}(\sigma z_1 - R) = h + z_1 - \frac{R}{\sigma}. \tag{14}$$

Thus this maximum lies in $h < z < h + z_1$ if and
only if $\sigma z_1 - R > 0$. If $\sigma z_1 - R \le 0$, it is easy to show that $\psi(z) \le V_0$,
and the device would operate as a surface CCD, since the potential
maximum in the semiconductor would be at the oxide-semiconductor
interface. It is easy to show that $\sigma z_1 - R > 0$ if and only if

$$V_0 < \tfrac{1}{2}\sigma(1 + \sigma)z_1^2. \tag{15}$$

When it is written in terms of dimensional quantities, we obtain
inequality (2).

Inequality (15) is a rough criterion that places an upper limit on
the plate voltages that may be used in a BCCD.
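
To make the criterion concrete, inequality (15) can be converted back to dimensional form, which gives a limiting voltage of $(eN_A Y_1^2/2\epsilon_s)\,\sigma(1 + \sigma)$ with $\sigma = N_D/N_A$. The sketch below evaluates this for a uniformly doped layer with $N_D = Q_n/Y_1$. The relative permittivity of 12 for silicon, the rounded physical constants, and the function name are assumptions made here for illustration; under them the results land within about 1 percent of the plate-voltage-limit column of Table I.

```python
E_CHARGE = 1.602e-19        # electronic charge, C
EPS0 = 8.854e-14            # vacuum permittivity, F/cm
EPS_SI = 12.0 * EPS0        # assumed silicon permittivity (relative value 12)


def plate_voltage_limit(q_n, y1_cm, n_a):
    """Dimensional form of inequality (15), in volts.

    q_n   : donor dose in cm^-2 (uniform density N_D = q_n / y1_cm assumed)
    y1_cm : n-layer thickness in cm
    n_a   : substrate acceptor density in cm^-3
    """
    sigma = q_n / (y1_cm * n_a)                 # N_D / N_A
    return E_CHARGE * n_a * y1_cm**2 * sigma * (1.0 + sigma) / (2.0 * EPS_SI)


for q_n, y1_um in [(1.5e12, 0.4), (1.5e12, 1.2), (0.5e12, 0.4), (4.5e12, 1.2)]:
    v_lim = plate_voltage_limit(q_n, y1_um * 1e-4, 5e14)
    print(f"Qn = {q_n:.1e} cm^-2, Y1 = {y1_um} um: V_lim ~ {v_lim:.0f} V")
```

Because the limit grows roughly as $Q_n^2/N_A$, tripling the dose raises it by about a factor of nine, which is the trend visible in Table I.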

Assuming (15) is satisfied, the value of the electrostatic potential
maximum is

$$\psi(z_m) = \frac{1}{2}\left(1 + \frac{1}{\sigma}\right)R^2. \tag{16}$$

It is straightforward to show that

$$\frac{dR}{dh} = \frac{\sigma z_1 - R}{h + \eta(R + z_1)}. \tag{17}$$

Thus, as long as $\sigma z_1 - R > 0$, $dR/dh > 0$. Consequently, from (16),
$d\psi(z_m)/dh > 0$ as long as (15) is satisfied. In other words, for electrode
voltages within the operating range of a BCCD, the value of the
electrostatic potential maximum is greater under the thick-oxide step
than it is under the thin-oxide step. This is just the opposite of the
case in a surface, stepped-oxide CCD. From (14) it follows that

$$\frac{d}{dh}(z_m - h) = -\frac{1}{\sigma}\frac{dR}{dh} < 0. \tag{18}$$

Thus, within the operating range, the position of the electrostatic
maximum is closer to the oxide-semiconductor interface under the
thick-oxide step than it is under the thin-oxide step.

A dielectric slab can keep optical beams confined transversely in its
plane if it is tapered, with the slab thickness having a maximum along some
straight line. When the square of the local wave number of the slab ($k^2$) is
a quadratic function of the transverse coordinate ($y$), the rays in the plane
of the slab are sinusoids whose optical length is almost independent of the
amplitude. For thin slabs ($2d \ll \lambda$) as well as for thick slabs ($2d \gg \lambda$),
pulse spreading is large because the ratio of the local phase to group velocity
is strongly dependent on the distance ($y$) from the axis. We show that pulse
spreading is almost negligible, however, if the thickness of the slab is
properly chosen. For example, if the slab thickness on axis is 2.5 micrometers
and the refractive index of the slab is 1 percent higher than that of the
surrounding medium, pulse spreading is only 0.05 nanosecond per kilometer
at a wavelength of 1 micrometer. Pulses in clad fibers having the same
width (0.2 millimeter) and carrying the same number of modes (15)
spread 50 times faster. Splicing and matching to injection lasers may be
easier with planar fibers than with conventional fibers. Low-dispersion
planar fibers are therefore attractive when used in conjunction with sources
that are multimoded in one dimension. Closed-form expressions are given
for square-law and linear-law profiles.


I. INTRODUCTION


This introduction gives first a brief review of the general concepts 
of pulse transmission in multimode waveguides,!? and subsequently 
considers the case of planar structures that ensure transverse confine- 
ment of the optical beams. 

The most important parameters of optical fibers for communication
are loss (perhaps a few decibels per kilometer) and pulse spreading
(perhaps a few tens of nanoseconds per kilometer). Given these two
parameters, the maximum repeater spacing and the transmission
capacity of the fiber are largely determined, considering the
limitations that presently exist in source power (L.E.D. or injection
lasers) and detector sensitivity (avalanche photodiodes). If the loss
is the limiting factor, a reduction in bandwidth allows an increase in
repeater spacing because of the increased receiver sensitivity, but only
by a modest distance. Conversely, baseband equalization allows the
transmission capacity to be increased at the expense of optical power,
but not by a very large factor. In this paper, we consider only the
problem of pulse spreading.

Consider first a single-mode waveguide; for instance, a rectangular
waveguide whose width is less than a wavelength. The wave
number $\beta$ may be a rapidly varying function of $\omega$, particularly near
cut-off. The transit time of a pulse of radiation is equal to the ratio
$L\,d\beta/d\omega$ of the path length $L$ and the group velocity $d\omega/d\beta$. Because a
pulse of small duration has a broad frequency spectrum, some
components arrive ahead of the others if $d\beta/d\omega$ varies with $\omega$; that is, if
$d^2\beta/d\omega^2 \neq 0$. The pulse duration, $\tau$, is of the order of $(L\,d^2\beta/d\omega^2)^{1/2}$. If
the waveguide is filled with a material having dispersion, the
phenomenon remains essentially the same. Single-mode pulse spreading
is small at optical frequencies when the carrier is almost
monochromatic (e.g., injection lasers) because, for a given kind of waveguide,
single-mode pulse spreading is inversely proportional to the square
root of the frequency; that is, it is 100 times smaller at optical
frequencies than at microwave frequencies. This effect can therefore be
neglected.
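
The factor of 100 follows from the inverse-square-root scaling once two representative carrier frequencies are chosen. In the short sketch below, the microwave carrier of 30 GHz is an assumed value picked for illustration, and the optical carrier corresponds to a 1-μm wavelength; the function name is likewise hypothetical.

```python
import math


def spreading_ratio(f_low_hz, f_high_hz):
    """Ratio of single-mode pulse spreading at f_low to that at f_high,
    assuming spreading scales as 1/sqrt(frequency) for a given guide type."""
    return math.sqrt(f_high_hz / f_low_hz)


microwave_hz = 30e9    # assumed representative microwave carrier
optical_hz = 3e14      # approximately c / (1 um)

print(spreading_ratio(microwave_hz, optical_hz))
```

A frequency ratio of $10^4$ between the two carriers is what produces the hundredfold reduction quoted in the text.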

A quite different mechanism for pulse spreading is found in
multimode waveguides (with modes of the order $m = 0, 1, 2, \cdots$) excited
by multimode sources. In most waveguides, different modes have
different group velocities. Thus, a pulse decomposes into a train of
pulses, one for each mode, having times of arrival $L\,d\beta_m/d\omega$, $m = 0,
1, 2, \cdots$. This effect has similarities with the multipath effects
observed in open space. Multimode pulse spreading is observed even
when a single mode is excited because, soon after, the power is
transferred to other modes and back to the first mode, as a result of the
irregularities of the fiber or of the bends (see Ref. 2 and references
therein). In this paper, we assume that the fiber is perfectly straight
and uniform, and investigate ways of minimizing the dependence of
$d\beta_m/d\omega$ on $m$.

To appreciate the magnitude of the problem, let us consider first a
nondispersive homogeneous dielectric slab with refractive index $n$ close
to unity. By comparing the length of rays at the critical angle ($\theta_c$)
to the length of axial rays, we find that the pulse spreading is
$\Delta T = (L/c)[(\cos\theta_c)^{-1} - 1] \approx (L/c)(n - 1)$. This pulse spreading can be
written as a function of $M$, the number of modes that we want to
transmit (a characteristic of the source used), and of the slab width $Y$:
$\Delta T = 400\,M^2(\lambda/Y)^2$ ns/km. For example, if we want to transmit 20 modes
and $Y = 70\lambda$, pulse spreading is 33 ns/km, a value that seriously limits
the transmission capacity for long-distance applications. The guide
width $Y$ cannot be increased very much because the bending losses
would rapidly increase and because it is difficult to fabricate clad fibers
with very small differences in refractive index.
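
The numerical example can be reproduced directly from the two expressions in the paragraph above. The sketch below evaluates $\Delta T = 400\,M^2(\lambda/Y)^2$ ns/km and then inverts $\Delta T \approx (L/c)(n - 1)$ to show the index step that such a clad slab would imply. The function names and the rounded value of the speed of light are assumptions introduced for illustration.

```python
C_KM_PER_NS = 2.998e-4   # speed of light expressed in km per ns


def clad_slab_spreading(modes, width_in_wavelengths):
    """Delta T = 400 M^2 (lambda / Y)^2, in ns/km, as quoted in the text."""
    return 400.0 * modes**2 / width_in_wavelengths**2


def implied_index_step(delta_t_ns_per_km):
    """Invert Delta T ~ (L / c)(n - 1) for a unit length to get n - 1."""
    return delta_t_ns_per_km * C_KM_PER_NS


dt = clad_slab_spreading(20, 70)
print(f"pulse spreading ~ {dt:.1f} ns/km")
print(f"implied n - 1  ~ {implied_index_step(dt):.4f}")
```

For 20 modes in a guide 70 wavelengths wide, the spreading comes out near 33 ns/km and the implied index step is roughly 1 percent.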

The difficulty is solved in principle if the permittivity $\epsilon$ of the
medium varies as the square of the transverse coordinate $y$:
$\epsilon(y) = 1 - y^2$. In a square-law medium, the optical length of the rays is
almost independent of their amplitude. If the permittivity has the
form $\epsilon(y) = (\cosh y)^{-2} \approx 1 - y^2 + \tfrac{2}{3}y^4 + \cdots$, rays have in fact all
exactly the same optical length.!*~” Because most glasses have negligible 
dispersion, such media exhibit very small pulse spreading.* Multimode 
square-law fibers are certainly attractive. However, it may prove 
difficult to obtain with sufficient accuracy the desired variation of 
permittivity. Furthermore, the losses (impurities and scattering) are 
usually higher for heterogeneous material than for homogeneous mate- 
rial such as fused quartz. It is therefore interesting to investigate 
whether a dimensional change can replace the continuous change in 
the refractive index considered above. 

A proposal to that effect was first made by Kawakami and
Nishizawa.¹ They have shown that optical beams can be confined
transversely in the plane of a slab if the slab thickness has a maximum along
some straight line ($z$-axis). This can be understood from a geometrical
optics point of view. The slab thickness can be considered a constant
over a small interval of the transverse coordinate $y$. Various modes can
propagate in this uniform slab. Let $\mathbf{k}$ denote the wave vector of one of
them, e.g., the $H_1$ mode. Because of isotropy, the magnitude $k$ of $\mathbf{k}$ is
the same in all directions. Once the local properties of the waveguide
characterized by the wave number $k(y)$ have been obtained, the
propagation of optical beams can be found, in the semiclassical
approximation. We need deal only with $k(y)$. For instance, if $k^2(y)$ is a quadratic
function of $y$, e.g., $k^2(y) = k_0^2 - \Omega^2 y^2$, the rays are sinusoids and
nearly all have the same optical length. Diffraction effects in the
$yz$ plane can be taken into account, to some extent, as the
Hamiltonian theory of beam modes shows.¹¹ For the quadratic variation
considered above, for example, the modes of propagation are
Hermite-Gauss,¹¹ regardless of the physical origin of the variation of $k$ with
$y$ (that is, whether the variation of $k$ with $y$ results from a genuine
variation in refractive index or from a change in slab thickness).
Because we are interested in highly multimoded fibers, we consider only
the geometrical optics field. In that approximation, a mode is
represented by a manifold of rays $y(z + \zeta)$, $0 < \zeta < Z$, where $Z$ denotes the
ray period. The main result of this representation is that the axial
propagation constant ($k_z$) of the guide is the value assumed by $k$ at
the turning point $y = \xi$ of the trajectory. Therefore, we need only
solve a ray equation.

*The properties of graded-index fibers that depart somewhat from a quadratic
law have also been investigated (Refs. 8 to 10).

The preceding discussion is applicable to the propagation of waves
at one angular frequency, $\omega_0$. To obtain information concerning the
propagation of optical pulses, we need to know, not only $k(y)$, but also
the variation of the local group velocity $u$ with $y$. If the ratio
$(\omega_0/k)(dk/d\omega)$ of the local phase velocity ($v = \omega_0/k$) and group velocity
($u = d\omega/dk$) happens to be independent of the $y$ coordinate, the time
of flight of a pulse along a ray trajectory is proportional to the optical
length of that ray. In that case (but only in that case), equal optical
lengths imply equal times of flight. The above condition ($v/u$
independent of $y$) is rather well satisfied for most materials with low
dispersion, such as fused quartz, whose refractive index is changed slightly
by such processes as ion implantation. (For normal quartz $n = 1.4564$,
$dn/d\lambda = -0.27 \times 10^{-5}$ at $\lambda = 0.6563$ μm.) In cases where there is a
physical change in the refractive index, it is sufficient to consider the
optical lengths of rays with different amplitudes to obtain with good
approximation the value of the pulse spreading. For a homogeneous
dielectric slab, however, the ratio of the local phase to group velocities
is strongly dependent on the slab thickness ($2d$), and therefore on $y$,
when either $2d \gg \lambda$* or $2d \ll \lambda$. (The latter approximation is
made in Ref. 1; pulse spreading for tapered slabs is not discussed in
Ref. 1.) We will show that small pulse spreading is obtained only for a
precise value of the slab thickness on axis. For simplicity, we have
considered only quadratic and linear dependences of $k^2$ on $y$. The optimum
profile may be different, however. In Section II we give the essential
formulas for the ray trajectories and times of flight in structures with
known local phase and group velocities. In Section III we consider in
detail the case of tapered slabs and give design values for low pulse
spreading. General results are given in Appendix A, and analytic
solutions for square-law and linear-law tapers are given in Appendix B.

*We are indebted to E. A. J. Marcatili for pointing out that pulse spreading in
thick, quadratically tapered slabs is almost as large as in clad slabs. This observation,
at first surprising, stimulated our interest in the problem.


II. GENERAL RESULTS


The local value $k$ of the wave number of a slab mode is given in
Section III. In the present section we assume that the local wave
number $k = \omega/v$ and the inverse $\partial k/\partial\omega$ of the local group velocity $u$ are
known functions of $y$ at the operating angular frequency ($\omega$). We give
the general form of the ray equations and the time of flight of a pulse
in a mode $m$, in the geometrical optics (J.W.K.B.) approximation.
The derivations are given in Appendix A.

In a medium that is isotropic, time-invariant, and independent of
the axial coordinate ($z$), that is, in a uniform fiber, the ray equations
for $y(z)$ are most convenient in the form

$$k_y^2 = k^2(\omega, y) - k_z^2, \tag{1a}$$
$$dy/dz = -\partial k_z/\partial k_y = k_y/k_z, \tag{1b}$$
$$dk_y/dz = \partial k_z/\partial y = \tfrac{1}{2}(\partial k^2/\partial y)/k_z, \tag{1c}$$
$$dt/dz = \partial k_z/\partial\omega = \tfrac{1}{2}(\partial k^2/\partial\omega)/k_z. \tag{1d}$$

Because of the $t$ and $z$ invariance of the medium, $\omega$ and $k_z$ are constant
along any given ray (constants of motion). The $x$ coordinate is ignored.
The first equation, (1a), says that, because of local isotropy, $k_y^2 + k_z^2$
is equal to $k^2$. In (1b) to (1d), $k_z$ is considered a function of $k_y$, $\omega$, and
$y$. Equations (1b) and (1c) are the ray equations. They give the
increments in ray position ($dy$) and momentum* ($dk_y$) for an increment
$dz$ of $z$. As indicated before, $k_z$ characterizes a ray trajectory, that is,
it is different from one ray to another, but it remains the same along
any given ray. We can eliminate $k_y$ from (1b) and (1c) by
differentiation. We obtain

$$d^2y/dz^2 = \tfrac{1}{2}(\partial k^2/\partial y)/k_z^2. \tag{2}$$

We first select, as an initial condition, the angle $\theta_0$ that the ray makes
with the $z$ axis at the origin of the coordinate system ($y = z = 0$). We
then evaluate the constant of motion $k_z$ from

$$k_z = k(0)\cos\theta_0. \tag{3}$$

*The ray momentum is the transverse component of the wave vector. Ray
momenta and photon momenta ($\hbar k$) are essentially equivalent concepts.

The ray trajectory $y(z)$ is obtained step by step from (1a) and (1b),

$$y_{i+1} = y_i + [k^2(y_i)/k_z^2 - 1]^{1/2}\,\Delta z, \tag{4}$$

$\Delta z$ being the increment in $z$, and $y_0 = 0$. Note that, because of
symmetry, it is sufficient to evaluate $y(z)$ from $y = 0$ to the turning point
$y = \xi$, with $\xi$ given by $k(\xi) = k_z$.
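
The stepping rule (4) is simple enough to try on a square-law profile, for which the ray is a sinusoid and the quarter period and turning point are known in closed form. The sketch below is only an illustration: the profile $k^2(y) = k_0^2 - \Omega^2 y^2$, the numerical value chosen for the taper constant, the launch angle, and the function name are all assumptions, and the step is the plain explicit rule of (4) rather than a higher-order integrator.

```python
import math


def trace_quarter_period(k_sq, k_z, dz, max_steps=10_000_000):
    """Apply the step rule (4) from y = 0 until the turning point.

    k_sq : function returning k^2 at transverse position y
    k_z  : axial wave number, a constant of motion for the ray
    Returns (z, y) where the argument of the square root stops being positive.
    """
    y, z = 0.0, 0.0
    for _ in range(max_steps):
        arg = k_sq(y) / k_z**2 - 1.0
        if arg <= 0.0:              # reached (or just passed) the turning point
            break
        y += math.sqrt(arg) * dz
        z += dz
    return z, y


wavelength = 1.0                         # um
k0 = 2.0 * math.pi / wavelength          # on-axis wave number
taper = 2.0 * math.pi * 1e-3             # hypothetical constant Omega, um^-2
theta0 = 0.05                            # launch angle in radians
k_z = k0 * math.cos(theta0)              # eq. (3)

z_turn, y_turn = trace_quarter_period(
    lambda y: k0**2 - (taper * y) ** 2, k_z, dz=1.0
)

quarter_period = math.pi * k_z / (2.0 * taper)   # Z/4 for the sinusoidal ray
xi = k0 * math.sin(theta0) / taper               # from k(xi) = k_z

print(f"stepped: z = {z_turn:.0f} um, y = {y_turn:.2f} um")
print(f"closed form: Z/4 = {quarter_period:.0f} um, xi = {xi:.2f} um")
```

The explicit step slightly overestimates the slope on a concave trajectory, so the stepped ray reaches the turning point a little early; reducing the increment narrows the gap.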

To any given value of $\theta_0$ (or $k_z$) we can associate a mode number $m$.
The mode number is the area enclosed in phase space ($k_y$, $y$) by a ray
trajectory, divided by $2\pi$, minus $\tfrac{1}{2}$ (see Appendix A). Thus, if the
integration is stopped at the turning point $y = \xi$ (one-fourth of the
ray trajectory), we have

$$m = \frac{2}{\pi}\int_0^{\xi} k_y\,dy - \frac{1}{2}, \quad k_y = (k^2 - k_z^2)^{1/2}. \tag{5}$$

Strictly speaking, only those values $\theta_m$ of $\theta_0$ should be considered that
make $m$ an integer in (5). However, because we are interested in modes
of high order, $m$ can be considered a continuous parameter. An
approximate value for $m$ is $\pi\theta_0\xi/\lambda$, where $\lambda$ denotes the wavelength on axis
[$k(0) = 2\pi/\lambda$].
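
The phase-space rule for the mode number can be checked on a square-law profile, where the enclosed area is a quarter ellipse and the integral has a closed form. The sketch below integrates the transverse wave number from the axis to the turning point with a midpoint rule and compares the result with the approximation $\pi\theta_0\xi/\lambda$. The profile, its taper constant, the launch angle, and the function names are assumptions chosen for illustration.

```python
import math


def mode_number(k_y, xi, n=20000):
    """m = (2 / pi) * integral of k_y from 0 to xi, minus 1/2 (midpoint rule)."""
    dy = xi / n
    area = sum(k_y((i + 0.5) * dy) for i in range(n)) * dy
    return 2.0 / math.pi * area - 0.5


wavelength = 1.0                       # um
k0 = 2.0 * math.pi / wavelength
taper = 2.0 * math.pi * 1e-3           # hypothetical constant in k^2 = k0^2 - taper^2 y^2
theta0 = 0.05
k_z = k0 * math.cos(theta0)            # eq. (3)
xi = k0 * math.sin(theta0) / taper     # turning point, k(xi) = k_z


def k_y(y):
    # Transverse wave number from (1a); clipped at zero against round-off.
    return math.sqrt(max(k0**2 - (taper * y) ** 2 - k_z**2, 0.0))


m = mode_number(k_y, xi)
m_closed_form = taper * xi**2 / 2.0 - 0.5
m_approx = math.pi * theta0 * xi / wavelength

print(f"numerical m = {m:.3f}, closed form = {m_closed_form:.3f}, "
      f"approximation = {m_approx:.3f}")
```

The approximation differs from the integral mainly by the constant one-half, which becomes unimportant for the high-order modes of interest.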

The time of flight $T$ of a pulse is, for a unit length, the inverse $1/v_g$
of the axial group velocity. We show in Appendix A that $T$ is obtained
most easily by integrating $ds/u$ along a ray, where $ds = (k/k_y)\,dy
= (k/k_z)\,dz$ denotes the elementary ray arc length, and $1/u = \partial k/\partial\omega$
the inverse of the local group velocity. Thus,

$$T = Z^{-1}\oint ds/u = (2/Z)\int_0^{\xi} (\partial k^2/\partial\omega)(k^2 - k_z^2)^{-1/2}\,dy. \tag{6}$$

Near the turning point ($k = k_z$), the integrand in (6) is singular. It is
therefore preferable from a computational point of view to set
$ds = (k/k_z)\,dz$ and integrate over $z$ rather than over $y$. We have [also
directly from (1d)]

$$T = (2/Zk_z)\int_0^{Z/4} (\partial k^2/\partial\omega)\,dz. \tag{7}$$


The purpose of this paper is to find ways to minimize the variation
$\Delta T$ of $T$ for $0 < m < M$, where $m$ is given in (5) and $M$ is the number
of modes that we want to transmit. It is interesting to compare this
variation to the variation $\Delta T_c$ for a clad fiber having the same width
$Y = 2\xi$ and the same number of modes $M$. The latter is, as we have
seen in the introduction,

$$\Delta T_c = (1/32)M^2(\lambda/\xi)^2/c. \tag{8}$$

Thus, we want to maximize a quality factor $Q$ defined as

$$Q = \Delta T_c/\Delta T = (1/32)M^2(\lambda/\xi)^2/c\,\Delta T. \tag{9}$$

Note that, since $\Delta T$ and $\Delta T_c$ are times of flight for unit lengths, they
have the dimensions of inverses of velocities. For given $k(y)$ and
$(\partial k/\partial\omega)(y)$, integration of (4), (5), and (7) gives $Q(\theta_0)$ in (9). As $\theta_0$ is
increased, $Q$ increases and reaches a maximum $Q_{max}$, which
characterizes the pulse spreading properties of an optical waveguide for a
given profile. The best profile is the one that maximizes $Q_{max}$, provided
other specifications (number of modes, channel width, $\cdots$) are met.
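
As a rough numerical illustration of (8) and (9), the sketch below evaluates the clad-fiber spreading and the quality factor for the figures quoted elsewhere in this paper: 15 modes, a guide width of 0.2 mm (so $\xi = 100$ μm), a wavelength of 1 μm, and a pulse spreading of 0.05 ns/km. The function names and the rounded speed of light are assumptions made for illustration.

```python
C_KM_PER_NS = 2.998e-4   # speed of light expressed in km per ns


def clad_spreading_ns_per_km(modes, wavelength, xi):
    """Eq. (8): Delta T_c = (1/32) M^2 (lambda / xi)^2 / c, in ns/km.

    wavelength and xi must share the same length unit.
    """
    return modes**2 * (wavelength / xi) ** 2 / (32.0 * C_KM_PER_NS)


def quality_factor(modes, wavelength, xi, delta_t_ns_per_km):
    """Eq. (9): Q = Delta T_c / Delta T."""
    return clad_spreading_ns_per_km(modes, wavelength, xi) / delta_t_ns_per_km


dt_clad = clad_spreading_ns_per_km(15, 1.0, 100.0)
q = quality_factor(15, 1.0, 100.0, 0.05)
print(f"clad fiber: {dt_clad:.2f} ns/km, quality factor Q ~ {q:.0f}")
```

With these inputs the quality factor comes out slightly below 50, which is consistent with the statement that pulses in the equivalent clad fiber spread about 50 times faster.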


III. TAPERED DIELECTRIC SLABS


Let us now consider the tapered dielectric slab shown in Fig. 1b. We
consider only the $H_1$ mode of the slab. A similar discussion would be
applicable to the $E_1$ mode (and to higher-order modes if the slab is
thick enough to support them). Of course, a profile that is optimum for
the $H_1$ mode need not be optimum for the $E_1$ mode, for example, unless
$\epsilon = n^2$ is very close to unity. Let us first give expressions applicable to
slabs with constant thicknesses. We assume that the medium is the
same on both sides of the slab. (For dissymmetrical media, the
formulas in Ref. 13 would be helpful.)

The dispersion equation $k(\omega)$ for $H_1$ modes in a slab with relative
permittivity $\epsilon$ and thickness $2d$ is, as is well known,

$$(kd)^2 - (\omega d/c)^2 = \phi^2\tan^2\phi, \tag{10a}$$
$$\phi^2 = \epsilon(\omega d/c)^2 - (kd)^2. \tag{10b}$$

From (10) we obtain at $\omega/c = 2\pi$ (that is, $\lambda = 1$ μm, using the μm
as the unit of length), by straightforward substitutions and
differentiations,

$$d = (1/2\pi)(\epsilon - 1)^{-1/2}\,\phi/\cos\phi, \tag{11a}$$
$$k^2 = (2\pi)^2[1 + (\epsilon - 1)\sin^2\phi], \tag{11b}$$
$$\frac{c}{2}\frac{\partial k^2}{\partial\omega} = \frac{2\pi(\epsilon\phi\tan\phi + \epsilon\sin^2\phi + \cos^2\phi)}{\phi\tan\phi + 1}. \tag{11c}$$


Thus, the quantities $k^2$ and $\tfrac{1}{2}\partial k^2/\partial\omega$ that enter in our previous
expressions are explicit functions of the parameter $\phi$, related to $d$ by (11a).
The parameter $\phi$ varies from $\pi/2$ for $d = \infty$ to 0 for $d = 0$. The
variation of

$$\frac{v}{u} = \frac{\omega}{k}\frac{\partial k}{\partial\omega} = \frac{\epsilon\phi\tan\phi + \epsilon\sin^2\phi + \cos^2\phi}{[1 + (\epsilon - 1)\sin^2\phi](\phi\tan\phi + 1)} \tag{12}$$

is plotted in Fig. 2 as a function of $\phi$ for various values of $\epsilon$. For
quadratic $k^2(y)$, the optimum value of $\phi$ on axis is close to the maximum of
the curves, shown by a dotted line, because, near this maximum, times
of flight are proportional to optical lengths (see Section II). Thus, we
have for that case a rule for the selection of the slab thickness on axis,
$2d_0 = 2d(0)$. The optimum value of $d_0$ may be slightly different,
however, from the one given by the maximum of the curves in Fig. 2,
because we want to minimize the variations of $T$ over a finite range of $m$.

Fig. 1. Planar fibers. (a) Fiber with constant thickness and variation of the
permittivity of the form $1/\cosh^2(y)$. (b) Tapered dielectric slab. The field is shown
for the $H_1$ slab mode. (c) Coupling between the various slab modes ($H_1$, $H_2$, $\cdots$,
$E_1$, $E_2$, $\cdots$) cannot be neglected when the thickness $2d(y)$ varies abruptly. This
may also eliminate the higher-order modes ($H_2$, $E_2$, $\cdots$) for suitable dimensions
(see Ref. 12).
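
Because (11a), (11b), and (12) are explicit in the characteristic angle, they are easy to evaluate. The sketch below computes the slab half-thickness, the squared wave number, and the ratio of phase to group velocity for a given angle and relative permittivity. It is an illustrative sketch only: the function names are introduced here, (11a) and (11b) are written for a general wavelength rather than for the unit wavelength used in the text, and the on-axis angle of 0.78 rad is an assumed value in the range discussed for a 1 percent index step.

```python
import math


def half_thickness(phi, eps, wavelength=1.0):
    """Eq. (11a): slab half-thickness d, in the unit of `wavelength`."""
    return wavelength / (2.0 * math.pi) * phi / (math.sqrt(eps - 1.0) * math.cos(phi))


def k_squared(phi, eps, wavelength=1.0):
    """Eq. (11b): square of the local wave number."""
    return (2.0 * math.pi / wavelength) ** 2 * (1.0 + (eps - 1.0) * math.sin(phi) ** 2)


def phase_to_group_ratio(phi, eps):
    """Eq. (12): ratio v/u of local phase velocity to local group velocity."""
    t = phi * math.tan(phi)
    s2 = math.sin(phi) ** 2
    numerator = eps * t + eps * s2 + (1.0 - s2)
    denominator = (1.0 + (eps - 1.0) * s2) * (t + 1.0)
    return numerator / denominator


eps = 1.01**2          # slab index 1 percent above the surrounding medium
phi0 = 0.78            # assumed characteristic angle on axis, in radians

print(f"2d on axis ~ {2.0 * half_thickness(phi0, eps):.2f} um")
for phi in (0.1, 0.4, 0.78, 1.2, 1.5):
    print(f"phi = {phi:4.2f}: v/u - 1 = {phase_to_group_ratio(phi, eps) - 1.0:.2e}")
```

The ratio tends to unity for both very thin and very thick slabs and has a maximum in between, which is the behavior used in the text to choose the thickness on axis.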

Instead of specifying the slab profile $d(y)$ or the square of the wave
number law $k^2(y)$, we find it convenient, for the ease of
computations, to specify $\phi(y)$. If $\phi$ is quadratic in $y$, both $k^2(y)$ and $d(y)$ are
quadratic in $y$ to first order. Thus, we set

$$\phi = \phi_0 - Ky^2, \tag{13}$$

where $K$ denotes a constant, in (11), and substitute in the ray
equations, (1b) and (1c), eq. (5) for $m$ and eq. (7) for $T$.

The variation of the time of flight as a function of the angle $\theta_0$ that
the ray makes with the $z$ axis at the origin is shown in Fig. 3 for $\phi_0 = 1.5$
to 0.2 and $n = 1.45$, $\lambda = 1$ μm. Large pulse spreading is observed
when the slab is very thick on axis ($\phi_0 = 1.5$) or very thin ($\phi_0 = 0.2$).
Optimum values are between 0.6 and 0.7. Detailed results will be given
for the case $n = 1.01$ (refractive index of the slab is 1 percent higher
than that of the surrounding medium), which seems of greater practical
importance.

Fig. 2. Variation of the ratio of phase to group velocity in a dielectric slab for
different relative permittivities, as a function of the characteristic angle $\phi$. The
optimum points of operation for low pulse spreading in square-law tapered slabs are
shown by a dashed line ($\lambda = 1$ μm).

Fig. 3. Ratio of vacuum to axial group velocities ($c/v_g$) as a function of the ray
angle ($\theta_0$) at the origin, for a tapered dielectric slab with $n = 1.45$ and a quadratic
variation of the characteristic angle $\phi$: $\phi = \phi_0 - 4 \times 10^{-5}y^2$, for various values of $\phi_0$.
Group delay is related to $c/v_g$ by $T = (10^4/3)\,c/v_g$ ns/km. The characteristic angle
on axis $\phi_0 = 0.65$ is seen to give a small variation of $c/v_g$ over a large range of values
of $\theta_0$ ($\lambda = 1$ μm).

For $n = 1.01$, $\lambda = 1$ μm, and $\phi = \phi_0 - 10^{-5}y^2$, we see in Fig. 4
that the tapered slab can be 50 times superior to the equivalent clad
fiber (factor $Q$). The profile of this fiber is shown in Fig. 6 (curve a),
the thickness on axis being equal to 2.5 μm. The results for the case of
a linear law ¢ = ¢, — 5 X 10-*|y| are shown in Fig. 5 and the cor- 
responding profile in Fig. 6 (curve 6b). For both quadratic and linear 
laws, we note that a trade-off has to be made between the quality 
factor Q and the mode number M. (Note that the results are meaning- 
ful only when J is large compared with unity.) 

In conclusion, tapered dielectric slabs can exhibit very low pulse
spreading if properly dimensioned. If the slab material has a refractive
index 1 percent higher than that of the surrounding medium, the thickness
should be of the order of 2.5 ± 0.2 μm at a wavelength of 1 μm.
The waveguide width would in that case be of the order of 0.2 mm.
Pulse spreading does not exceed 0.05 ns/km for 15 modes. These optical
waveguides are attractive because they can be stacked for multichannel
operation (a possible arrangement is shown in Fig. 7) and
splicing would perhaps be easier than with conventional fibers (a good
angular alignment, however, is required for planar fibers). Further
technological research is needed to settle this point.
APPENDIX A
Times of Flight in the J.W.K.B. Approximation

The purpose of this appendix is to derive the ray equations and the
time-of-flight equations from general principles. We start from the
Hamilton equations in space-time both for conceptual clarity and to
facilitate generalizations to anisotropic or time-varying media (which
are not discussed in detail in the main text, but are of potential
interest).

Fig. 4. Variation with the characteristic angle on axis $\phi_o$ (or slab thickness on
axis $2d_o$) of the quality factor Q (defined as the ratio of pulse spreading for an
equivalent clad fiber $\Delta T_c$ to the actual pulse spreading $\Delta T$) for $\kappa$ = 1.02 (n = 1.01) and
$\phi = \phi_o - 10^{-5} y^2$. $\xi$ denotes the maximum ray excursion, M the total number of
modes, and $\Delta T$ the pulse spreading. The ray period is 14 mm and $\theta_o$ is equal to 2.6°
for $\phi_o$ = 0.785 ($\lambda$ = 1 μm). (Abscissa: $\phi_o$ in radians; ordinates in ns/km and μm.)

Fig. 5. Variation with the characteristic angle on axis $\phi_o$ (or slab thickness on
axis $2d_o$) of the quality factor Q for $\kappa$ = 1.02 (n = 1.01) and $\phi = \phi_o - 5 \times 10^{-3}|y|$.
The variations of $\xi$, M, and $\Delta T$ are also given. The ray period is 5.6 mm and $\theta_o$ is
equal to 4° for $\phi_o$ = 1. (Abscissa: $\phi_o$ in radians; ordinates in ns/km and μm.)

A general medium is described by a function of $\omega$, $\mathbf{k}$, $t$, $\mathbf{x}$:

$$H(\omega, \mathbf{k}, t, \mathbf{x}) = 0. \qquad (14a)$$

The space-time trajectories (world lines) of particles or wave packets,
$[t(\sigma), \mathbf{x}(\sigma)]$ or $\mathbf{x}(t)$, are obtained by integrating the Hamilton equations

$$dt/d\sigma = -\partial H/\partial\omega,$$
$$d\mathbf{x}/d\sigma = \partial H/\partial\mathbf{k},$$
$$d\omega/d\sigma = \partial H/\partial t, \qquad (14b)$$
$$d\mathbf{k}/d\sigma = -\partial H/\partial\mathbf{x},$$

where $\sigma$ denotes an arbitrary parameter.* These equations are in a suitable
form for numerical integration. The initial conditions must, of
course, be consistent with (14a). Then (14a) remains satisfied at all
$\sigma$ because, from (14b), $dH/d\sigma = 0$.

For time-invariant media, the form

$$\omega = \omega(\mathbf{k}, \mathbf{x}) \qquad (15)$$

is more useful. The motion $\mathbf{x}(t)$ of a wave packet is a solution of the
Hamilton equations

$$d\mathbf{x}/dt = \partial\omega/\partial\mathbf{k},$$
$$d\mathbf{k}/dt = -\partial\omega/\partial\mathbf{x}. \qquad (16)$$

If we are interested only in ray trajectories at some fixed $\omega$, we can
rewrite (15)

$$h(\mathbf{k}, \mathbf{x}) = 0, \qquad (17a)$$

and obtain the rays from

$$d\mathbf{x}/d\sigma = \partial h/\partial\mathbf{k},$$
$$d\mathbf{k}/d\sigma = -\partial h/\partial\mathbf{x}, \qquad (17b)$$

where $\sigma$ is again an arbitrary parameter. Equation (17) is the reduction
of (14) to three dimensions. Note that the Fermat principle (in three
dimensions) is applicable to rays $\mathbf{x}(\sigma)$ at a constant frequency $\omega$. It is
unrelated to the time of flight of wave packets, except for nondispersive
media. It is important for our study that the time of flight of a pulse
be carefully distinguished from the transit time of the crest of a
time-harmonic wave (optical length). The latter is the integral of the ray
index along the ray path, evaluated at a fixed frequency $\omega$.

*If we define $X = \{\mathbf{x}, ict\}$, $K = \{\mathbf{k}, i\omega/c\}$, the Hamilton equations (14b) are
$dX/d\sigma = \partial H/\partial K$ and $dK/d\sigma = -\partial H/\partial X$. The latter follows from the first (see
Ref. 11) because $H = 0$ and $K = \nabla S$. The dynamical significance of the Hamilton
equations follows from the expression of the canonical stress-energy tensor (Ref. 14):
$T = K\,\partial\mathcal{L}/\partial K$, where $\mathcal{L}$ denotes the averaged Lagrangian density. $\partial\mathcal{L}/\partial K$ is the
(conserved) wave action, and $T$ is conserved in time-invariant homogeneous media.
The equality of group and energy velocities readily follows from this expression for
$T$. Note that these results are applicable to any linear wave (e.g., matter waves,
acoustical waves, or optical waves).

Fig. 6. Slab profiles for n = 1.01, plotted as 2d (μm) versus |y| (μm). (a) Quadratic
case $\phi = 0.785 - 10^{-5} y^2$. (b) Linear case $\phi = 1 - 5 \times 10^{-3}|y|$.

Fig. 7. Stacked tapered dielectric slabs (10 channels). Adjacent slabs are separated
by slabs with inverted slope and high index to minimize crosstalk caused by scattering.
We need the Hamilton equations in one more form, in which the z
axis is singled out. For media that are invariant in the z direction, it is
convenient to solve (14a) for $k_z$. Ignoring the x coordinate, we have

$$H = k_z - k_z(\omega, k_y, y) = 0, \qquad (18a)$$

and the ray equations are, from (14b),

$$dy/dz = -\partial k_z/\partial k_y,$$
$$dt/dz = \partial k_z/\partial\omega, \qquad (18b)$$
$$dk_y/dz = \partial k_z/\partial y,$$

where $\omega$ and $k_z$ are constants of motion. If the surface y, z is isotropic,
$\mathbf{k}$ enters only through its magnitude $k$. Thus,

$$k_z^2 = k^2(\omega, y) - k_y^2, \qquad (19)$$

and (18) becomes

$$dy/dz = k_y/k_z, \qquad (20a)$$
$$dk_y/dz = \tfrac{1}{2}(\partial k^2/\partial y)/k_z, \qquad (20b)$$
$$dt/dz = \tfrac{1}{2}(\partial k^2/\partial\omega)/k_z. \qquad (20c)$$

These are the expressions used in the main text. Equations (20a) and 
(20b) give the rate of change of the ray position and momentum as a 
function of z. Equation (20c) gives the time of flight of a pulse by direct 
integration. We now show that this result can be obtained from the 
J.W.K.B. approximation of the wave optics solution. 
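The remark that the ray equations are in a suitable form for numerical integration can be illustrated with a short sketch. It integrates (20a) and (20b) with a classical fourth-order Runge-Kutta step for a square-law medium $k^2 = k_o^2 - \Omega^2 y^2$ (the case treated in Appendix B) and compares the result with the closed-form ray (38a). The function name, the step count, and all numerical values are assumptions chosen for illustration; they are not taken from the paper.

```python
import math


def trace_square_law_ray(k_o, big_omega, theta_o, z_end, steps):
    """Integrate (20a) and (20b) with classical RK4 for k^2 = k_o^2 - big_omega^2 * y^2.

    Returns the final ray height y, the final k_y, and the conserved k_z.
    """
    k_z = k_o * math.cos(theta_o)   # constant of motion
    k_y = k_o * math.sin(theta_o)   # transverse wave number at the origin
    y = 0.0

    def rhs(y_val, ky_val):
        # (20a): dy/dz = k_y / k_z
        # (20b): dk_y/dz = (1/2)(dk^2/dy) / k_z = -big_omega^2 * y / k_z
        return ky_val / k_z, -big_omega ** 2 * y_val / k_z

    h = z_end / steps
    for _ in range(steps):
        a1, b1 = rhs(y, k_y)
        a2, b2 = rhs(y + 0.5 * h * a1, k_y + 0.5 * h * b1)
        a3, b3 = rhs(y + 0.5 * h * a2, k_y + 0.5 * h * b2)
        a4, b4 = rhs(y + h * a3, k_y + h * b3)
        y += h * (a1 + 2 * a2 + 2 * a3 + a4) / 6.0
        k_y += h * (b1 + 2 * b2 + 2 * b3 + b4) / 6.0
    return y, k_y, k_z


# Illustrative (assumed) values; lengths in micrometres.
K_O = 2 * math.pi * 1.005
BIG_OMEGA = 2.8e-3
THETA_O = 0.045
Z_END = 5000.0

y_num, ky_num, k_z = trace_square_law_ray(K_O, BIG_OMEGA, THETA_O, Z_END, 2000)
# Closed-form ray (38a): y = (k_yo / Omega) * sin(Omega * z / k_z)
y_exact = (K_O * math.sin(THETA_O) / BIG_OMEGA) * math.sin(BIG_OMEGA * Z_END / k_z)
print(y_num, y_exact)
```

The same loop could accumulate (20c) to obtain the time of flight once a dispersion model for $k_o(\omega)$ and $\Omega(\omega)$ is chosen.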

The scalar Helmholtz equation is obtained from the substitution

$$k_y \rightarrow -i\,\partial/\partial y \qquad (21)$$

in (19). We obtain

$$[\partial^2/\partial y^2 + k^2(\omega, y)]\,\psi_m = k_{zm}^2\,\psi_m, \qquad (22)$$

where m = 0, 1, 2, ···, for trapped modes. Given $k(\omega, y)$, we look for
solutions of (22) that are square-integrable and obtain the time of
flight of a pulse in a mode m over a unit length by differentiating $k_{zm}$
with respect to $\omega$,

$$T \equiv 1/v_g = \partial k_{zm}/\partial\omega. \qquad (23)$$


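Instead of solving (22) for $k_z$ and differentiating with respect to $\omega$,
we may use the Hellmann-Feynman (H.F.) theorem.!® Let $\mathcal{H}$ be a
self-adjoint operator depending on a parameter $\omega$,

$$\mathcal{H}(\omega)\psi_m = E_m\psi_m. \qquad (24)$$

Premultiplying both sides of (24) by $\psi_m$ we obtain

$$E_m = \langle\psi_m \mathcal{H}(\omega)\psi_m\rangle / \langle\psi_m\psi_m\rangle, \qquad (25)$$

where

$$\langle ab\rangle \equiv \int_{-\infty}^{+\infty} a^*(y)\,b(y)\,dy. \qquad (26)$$

It is not difficult to show that $E_m$ is stationary with respect to a small
change in $\psi_m$. Thus, when we differentiate (25) with respect to $\omega$ (or
$\omega^2$), we can ignore the dependence of $\psi_m$ on $\omega$ (or $\omega^2$). We have for $\psi$
real

$$\frac{\partial E_m}{\partial\omega^2} = \frac{\langle\psi_m (\partial\mathcal{H}/\partial\omega^2)\,\psi_m\rangle}{\langle\psi_m\psi_m\rangle}. \qquad (27)$$

In our case, (22),

$$\mathcal{H}(\omega) \equiv \partial^2/\partial y^2 + k^2(\omega, y). \qquad (28)$$

Thus, by application of the H.F. theorem we obtain

$$(c/v_g)_m = (k_o/k_{zm})(dk_{zm}^2/dk_o^2) = (k_o/k_{zm}) \int_{-\infty}^{+\infty} (\partial k^2/\partial k_o^2)\,\psi_m^2\,dy \Big/ \int_{-\infty}^{+\infty} \psi_m^2\,dy = (k_o/k_{zm})\langle\partial k^2/\partial k_o^2\rangle_m. \qquad (29)$$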

The J.W.K.B. method shows that, for large m, a mode can be represented
by a manifold of rays satisfying the Bohr-Sommerfeld condition

$$\oint k_y\,dy = (2m + 1)\pi, \qquad (30)$$

where the integral on the left-hand side in (30) is the area enclosed in
phase space ($k_y$, y) by a ray trajectory. Equation (30) expresses the
uniqueness of the phase of the field. At the turning point, $k_y = 0$,
$y = \xi_m$, we have from (19)

$$k_{zm} = k(\omega, \xi_m). \qquad (31)$$


An alternative way of obtaining the time of flight of a pulse in a
mode m is to integrate $ds/u$ from z = 0 to 1 along a ray of the manifold.
The arc length is denoted by $ds = (k/k_z)\,dz$ and $u^{-1} = \partial k/\partial\omega$ is
the inverse of the local group velocity. Thus,

$$T = \int_0^1 (k/k_z)(\partial k/\partial\omega)\,dz = (2k_z)^{-1}\langle\partial k^2/\partial\omega\rangle. \qquad (32)$$

This expression, (32), in which $\langle\ \rangle$ denotes an average taken along a
ray period, is the semiclassical analog of the Hellmann-Feynman theorem,
eqs. (27) and (29), and is used in the main text. It can be obtained
alternatively by noting that the group velocity in a waveguide is the
ratio of the total energy flow to the energy stored per unit length.
(This is a special case of the theorem derived in Ref. 16 for periodic
bi-anisotropic media. To obtain the result applicable to open waveguides,
we only have to let the periods go to infinity.) The result, (32),
follows by integrating the energy density along a ray pencil bounded
by the rays $y(z)$ and $y(z + dz)$. Let us sketch the proof. If $P\,dz$ denotes
the energy flow in this ray pencil, the total energy flow in the waveguide
is $PZ$. The energy density, on the other hand, is $P/(u \sin\theta)$,
where $\theta$ is the angle that the ray makes with the z axis. Thus, the energy
per unit length is obtained by integrating $P\,ds/u$ along the ray, in
agreement with (32).


APPENDIX B 
Square-Law and Linear-Law Media 


In this appendix we work out the case of square-law and linear-law 
media because they lend themselves to exact analytical expressions 
that are useful for comparison with computed solutions. The case in 
which the wave number k varies quadratically with y is also useful to 
obtain first-order solutions. Let us consider this case first. 


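$$k^2(\omega, y) = k_o^2(\omega) - \Omega^2(\omega)\,y^2, \qquad (33)$$

where the functions $k_o(\omega)$ and $\Omega(\omega)$ are arbitrary. The wave equation,
(22), is

$$(\partial^2/\partial y^2 + k_o^2 - \Omega^2 y^2)\,\psi = k_z^2\,\psi, \qquad (34)$$

where $\psi$ represents, for instance, the y component of the electric field
for H modes in a dielectric slab. This equation has the well-known
eigenvalues

$$k_z^2 = k_o^2 - (2m + 1)\Omega. \qquad (35)$$

Thus,

$$T \equiv 1/v_g = dk_z/d\omega = [k_o\dot{k}_o - (m + \tfrac{1}{2})\dot{\Omega}]\,[k_o^2 - (2m + 1)\Omega]^{-1/2}$$
$$= \dot{k}_o + (\Omega/k_o)(\dot{k}_o/k_o - \dot{\Omega}/\Omega)(m + \tfrac{1}{2}) + (\Omega^2/k_o^3)\,[(3/2)\dot{k}_o/k_o - \dot{\Omega}/\Omega]\,(m + \tfrac{1}{2})^2 + \cdots, \qquad (36)$$

where upper dots denote differentiation with respect to $\omega$. The condition
for the removal of the first-order terms in (36), $\dot{k}_o/k_o = \dot{\Omega}/\Omega$, is the
same as the condition of stationarity of $v/u = \tfrac{1}{2}\omega k^{-2}(\partial k^2/\partial\omega)$ given in
the main text. (Note that m is proportional to $\theta_o^2$. Thus, first-order
terms in m correspond to $\theta_o^2$ terms.)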
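The expansion (36) and the role of the matching condition $\dot{k}_o/k_o = \dot{\Omega}/\Omega$ can be checked numerically. The sketch below evaluates the closed form and the series truncated after the $(m + \tfrac{1}{2})^2$ term, then compares a matched and an unmatched profile. The function names and all numerical values (arbitrary units) are assumptions made for illustration.

```python
import math


def time_of_flight_exact(k_o, k_o_dot, big_omega, big_omega_dot, m):
    """Closed form in (36): T = [k_o k_o' - (m + 1/2) Omega'] / k_z, with k_z from (35)."""
    mu = m + 0.5
    return (k_o * k_o_dot - mu * big_omega_dot) / math.sqrt(k_o ** 2 - 2.0 * mu * big_omega)


def time_of_flight_series(k_o, k_o_dot, big_omega, big_omega_dot, m):
    """Series in (36), truncated after the (m + 1/2)^2 term."""
    mu = m + 0.5
    first = (big_omega / k_o) * (k_o_dot / k_o - big_omega_dot / big_omega) * mu
    second = (big_omega ** 2 / k_o ** 3) * (1.5 * k_o_dot / k_o - big_omega_dot / big_omega) * mu ** 2
    return k_o_dot + first + second


# Illustrative (assumed) values.
K_O, K_O_DOT, BIG_OMEGA, M = 6.0, 1.0, 3.0e-3, 10

# Unmatched profile: Omega does not vary with frequency.
spread_unmatched = time_of_flight_exact(K_O, K_O_DOT, BIG_OMEGA, 0.0, M) - K_O_DOT

# Matched profile: Omega'/Omega = k_o'/k_o removes the first-order term.
matched_dot = BIG_OMEGA * K_O_DOT / K_O
spread_matched = time_of_flight_exact(K_O, K_O_DOT, BIG_OMEGA, matched_dot, M) - K_O_DOT

print(spread_unmatched, spread_matched)
```

With these numbers the matched profile leaves only the second-order term, so its delay offset relative to the axial value is smaller by roughly three orders of magnitude.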
Let us now show that this result can be derived from the ray equations.
Equations (19) and (20) give

$$dy/dz = k_y/k_z, \qquad (37a)$$
$$dk_y/dz = -\Omega^2 y/k_z, \qquad (37b)$$
$$d^2y/dz^2 + (\Omega/k_z)^2\,y = 0. \qquad (37c)$$

The solution of these equations is straightforward. We obtain

$$y = (k_{yo}/\Omega)\,\sin[(\Omega/k_z)z], \qquad (38a)$$
$$k_y = k_{yo}\,\cos[(\Omega/k_z)z], \qquad (38b)$$

where

$$k_{yo}^2 = k_o^2 - k_z^2, \qquad (38c)$$

if we specify, for simplicity, that y(0) = 0, and use (33). The quantum
condition, (30), is therefore

$$k_{yo}^2 = (2m + 1)\Omega. \qquad (39)$$

Thus, setting $k_{yo}/\Omega = \xi$, the axial wave number is given by

$$k_z^2 = k^2(\omega, \xi) = k_o^2 - \Omega^2\xi^2 = k_o^2 - (2m + 1)\Omega, \qquad (40a)$$

in exact agreement with (35) (the agreement needs to be exact only
for square-law media).

The ratio of the optical length of a ray period (period Z) to the
corresponding length on axis is

$$R = (k_o Z)^{-1}\int_0^Z k\,ds = (k_o k_z Z)^{-1}\int_0^Z k^2\,dz$$
$$= (k_o k_z Z)^{-1}\int_0^Z \{k_o^2 - k_{yo}^2\,\sin^2[(\Omega/k_z)z]\}\,dz$$
$$= (1 - \tfrac{1}{2}\sin^2\theta_o)/\cos\theta_o = 1 + \theta_o^4/8 + \cdots, \qquad (40b)$$

where $\theta_o$ denotes the angle between the ray and the z axis at the origin.
By comparison, we have for a clad slab

$$R_c = 1/\cos\theta_o \approx 1 + \theta_o^2/2 + \cdots. \qquad (40c)$$

Thus, for small $\theta_o$, $R - 1$ is much smaller than $R_c - 1$, as discussed in
the introduction. The above results, (40b) and (40c), are significant in
the problem of pulse spreading in graded-index fibers if the material
has low dispersion, but they are not relevant to tapered dielectric
slabs. They are given here only for comparison.
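As a quick check of (40b) and (40c), the sketch below averages $k^2$ over one ray period numerically and compares the result with the closed form and with the leading terms of the two expansions. The function name, the sample count, and the test angle are assumptions; $k_o$ is normalized to unity because it cancels in the ratio.

```python
import math


def square_law_length_ratio(theta_o, samples=20000):
    """Numerical R of (40b): period average of k^2, divided by k_o * k_z, with k_o = 1."""
    k_z = math.cos(theta_o)
    k_yo = math.sin(theta_o)
    total = 0.0
    for i in range(samples):
        phase = 2.0 * math.pi * (i + 0.5) / samples   # (Omega / k_z) * z over one period
        total += 1.0 - (k_yo * math.sin(phase)) ** 2  # k^2 along the ray, from (33) and (38a)
    return total / samples / k_z


THETA_O = 0.1   # assumed ray angle at the origin, in radians
r_numeric = square_law_length_ratio(THETA_O)
r_closed = (1.0 - 0.5 * math.sin(THETA_O) ** 2) / math.cos(THETA_O)
r_clad = 1.0 / math.cos(THETA_O)
print(r_numeric - 1.0, r_clad - 1.0)
```

At this angle the excess optical length of the square-law ray is a few hundred times smaller than that of the clad slab, in line with the $\theta_o^4/8$ versus $\theta_o^2/2$ scaling.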
Let us now evaluate the group velocity by integrating $ds/u$ along a
ray of the manifold, following (32). We have

$$1/v_g = (k_z Z)^{-1}\int_0^Z (k_o\dot{k}_o - \Omega\dot{\Omega}\,y^2)\,dz, \qquad (41)$$

where $Z = 2\pi k_z/\Omega$ denotes the spatial period, and $y(z)$ is given in
(38a). The integration is straightforward. Using (39), a result identical
to (36) is obtained. Note that the above results are exact; the paraxial
approximation was not made. We have shown in Appendix A that it
is legitimate to evaluate pulse spreading by integrating the inverse of
the local group velocity along rays representing the modes of propagation,
in the limit of large mode numbers. The agreement is now found
to be exact for square-law media.
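The agreement between the ray average (41) and the modal result (36) can be reproduced with a few lines of code. The sketch evaluates (41) by sampling one ray period, using (38a) for $y(z)$ and (39) for the launch condition, and compares it with the closed form of (36). The function names and numerical values (arbitrary units) are assumptions for illustration.

```python
import math


def time_of_flight_modal(k_o, k_o_dot, big_omega, big_omega_dot, m):
    """Closed form of (36), obtained by differentiating the eigenvalue (35)."""
    k_z = math.sqrt(k_o ** 2 - (2 * m + 1) * big_omega)
    return (k_o * k_o_dot - (m + 0.5) * big_omega_dot) / k_z


def time_of_flight_ray(k_o, k_o_dot, big_omega, big_omega_dot, m, samples=4096):
    """Ray average (41): mean of (k_o k_o' - Omega Omega' y^2) over one period, divided by k_z."""
    k_yo_sq = (2 * m + 1) * big_omega        # quantum condition (39)
    k_z = math.sqrt(k_o ** 2 - k_yo_sq)      # from (38c)
    total = 0.0
    for i in range(samples):
        phase = 2.0 * math.pi * (i + 0.5) / samples              # (Omega / k_z) * z
        y_sq = (k_yo_sq / big_omega ** 2) * math.sin(phase) ** 2  # square of (38a)
        total += k_o * k_o_dot - big_omega * big_omega_dot * y_sq
    return total / samples / k_z


# Illustrative (assumed) values.
t_modal = time_of_flight_modal(6.0, 1.0, 3.0e-3, 2.0e-4, 10)
t_ray = time_of_flight_ray(6.0, 1.0, 3.0e-3, 2.0e-4, 10)
print(t_modal, t_ray)
```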
For a linear-law medium with

$$k^2(\omega, y) = k_o^2(\omega) - 2a(\omega)\,|y|, \qquad (42)$$

we shall only give the results. The rays are, from (20),

$$y(z) = \tan\theta_o\,z - (a/2k_o^2\cos^2\theta_o)\,z^2, \qquad 0 < z < Z/2,$$
$$y(z) = -y(z - Z/2), \qquad Z/2 < z < Z, \qquad (43)$$

with a period

$$Z = 4k_o^2\,\sin\theta_o\cos\theta_o/a. \qquad (44)$$

The ratio of the optical length of a ray to the length on axis is

$$\int k\,ds \Big/ \int k_o\,dz = (1 - \tfrac{2}{3}\sin^2\theta_o)/\cos\theta_o = 1 - \theta_o^2/6 + \cdots. \qquad (45)$$

The situation is opposite to that of a clad fiber: The optical length
decreases as $\theta_o$ increases. Therefore, we may in that case have a small
increase of $v/u$ when the slab thickness is reduced, that is, work on the
right side of the dotted line in Fig. 2. This leads to a thicker slab than
in the case of square-law profiles. These theoretical results are confirmed
by the curves in Figs. 5 and 6. We note that the optimum $\phi_o$
is about 1, the maximum of the $v/u$ curve being at only 0.8. The time
of flight is, using (32),

$$T \equiv 1/v_g = (\cos\theta_o)^{-1}\,dk_o/d\omega - \tfrac{1}{3}(k_o/a)(\sin^2\theta_o/\cos\theta_o)(da/d\omega).$$

Thus, $T$ is independent of $\theta_o$ for small $\theta_o$ (no terms in $\theta_o^2$) if $k_o(\omega)$ and
$a(\omega)$ in (42) are related by

$$(dk_o/d\omega)/k_o = \tfrac{2}{3}\,(da/d\omega)/a. \qquad (46)$$

It can be shown that this condition corresponds to an increase of $v/u$
with |y|, in agreement with the previous discussion.
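The linear-law results can be checked the same way. The sketch below follows the parabolic ray (43) over half a period, averages $k^2$ from (42), and compares the ratio with the closed form (45). The function name, the taper constant `a`, and the test angle are assumptions; the ratio does not depend on `a` or on $k_o$.

```python
import math


def linear_law_length_ratio(theta_o, a=1.0e-3, k_o=1.0, samples=20000):
    """Numerical version of (45): average of k^2 over half a ray period / (k_o * k_z)."""
    sin_t, cos_t = math.sin(theta_o), math.cos(theta_o)
    k_z = k_o * cos_t
    half_period = 2.0 * k_o ** 2 * sin_t * cos_t / a   # Z/2 from (44)
    total = 0.0
    for i in range(samples):
        z = half_period * (i + 0.5) / samples
        y = math.tan(theta_o) * z - a * z * z / (2.0 * k_o ** 2 * cos_t ** 2)  # ray (43)
        total += k_o ** 2 - 2.0 * a * y                                        # k^2 from (42)
    return total / samples / (k_o * k_z)


THETA_O = 0.1   # assumed ray angle at the origin, in radians
ratio_numeric = linear_law_length_ratio(THETA_O)
ratio_closed = (1.0 - (2.0 / 3.0) * math.sin(THETA_O) ** 2) / math.cos(THETA_O)
print(ratio_numeric, ratio_closed)
```

The ratio falls below unity, which is the behavior opposite to that of a clad fiber noted in the text.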
Theory of the Single-Material Fiber


By D. MARCUSE 
(Manuscript received March 6, 1974) 


The term “single-material fiber” describes a dielectric optical waveguide
made of only one type of glass. The theory of this waveguide is simplified
by placing the structure between two perfectly conducting planes that have
very little influence on the properties of the low-order modes.

The field distribution and propagation constant of the lowest-order mode 
are investigated and compared to an approximate theory. 


I. INTRODUCTION 


A dielectric optical waveguide made entirely of one type of material
is called a “single-material fiber.”¹ Figure 1 shows such a structure
schematically. It may be regarded as a rectangular dielectric waveguide
supported by two infinitely extended slabs made of the same
material. Such a structure has been shown to be capable of supporting
modes that are concentrated near the enlarged section of the waveguide
and that do not lose power by energy seepage into the slabs.¹,²
Single-material fibers are usually made of pure fused silica. Since no
other material is needed to form a waveguide, the low-loss properties
of pure fused silica can be fully utilized.³

The single-material fiber has been described by means of an approximate
theory by Marcatili.¹ The theory presented here serves the purposes
of proving that truly guided modes do indeed exist in single-material
fibers and of providing more precise solutions for comparison
with the approximate theory.

An analysis of the guided modes of the single-material fiber is presented
in this paper. The mode field is expressed as a superposition of
the guided modes as well as the radiation modes of the two types of
slabs. The enlarged region, henceforth called the core, can be regarded
as a slab joined by narrower support slabs on either side. Since the
radiation modes of the slabs have a continuous spectrum of eigenvalues⁴,⁵
(propagation constants), their contribution to the total field
consists of an integral that must be approximated by a sum for purposes
of numerical analysis. In addition, the mathematical expressions
describing the guided and radiation modes are different, so that the
analysis becomes rather complex.

Fig. 1. Single-material fiber showing the rectangular core attached to its support
slab.

To simplify the analysis, it is convenient to consider the single-material
fiber enclosed between two perfectly conducting planes,
as shown in Fig. 2. Since the fields of the guided modes of the slabs,
and hence the field of the guided mode of the fiber, are very tightly
confined near the dielectric structure, the presence of the perfectly
conducting planes does not appreciably influence the shape of horizontally
polarized fields. However, the simplification of the analysis is
considerable, since the modes that correspond to the guided modes of
the open slab and the waveguide modes of the parallel plate system
(corresponding to the radiation modes of the open slab) are now described
by one analytical expression and belong to a system of discrete
eigenvalues. There is, therefore, no need to worry about a suitable
approximation to the integral over the radiation modes of the slabs.
Vertically polarized fields (polarized in the y-direction) are strongly
influenced by the presence of the metal plates, since the normal field
components reach the metal plates with maximum intensity. For this
reason, we limit the study of the single-material fiber to horizontally
polarized modes. Vertically polarized modes could be treated if the
perfectly conducting planes were replaced by magnetic short circuits.
(An explanation of the negligible influence of the perfect conductors is
given in the appendix.)

Fig. 2. For our analysis, the single-material fiber is placed between two perfectly
conducting planes.

After formulating the exact solution to our problem, we present 
numerical approximations for the field distributions and the solution 
of the eigenvalue equation. The theory is compared to an approximate 
analysis. 


II. CALCULATION OF THE MODES OF THE SINGLE-MATERIAL FIBER


The electric and magnetic fields of the modes of the single-material
fiber are expressed as

$$\mathbf{E}^{(i)} = \sum_{\nu=1}^{\infty} c_\nu^{(i)}\,\boldsymbol{\mathcal{E}}_\nu^{(i)} \qquad (1)$$

and

$$\mathbf{H}^{(i)} = \sum_{\nu=1}^{\infty} c_\nu^{(i)}\,\boldsymbol{\mathcal{H}}_\nu^{(i)}. \qquad (2)$$

The script symbols indicate modes of the slabs. The superscript i
assumes the values 1 and 2. Value 1 indicates modes of the wider slab
that forms the core of the single-material fiber, while value 2 indicates
the modes of the narrower supporting slabs.

The modes in the core region are those of a metallic parallel plate
waveguide. These modes can be designated as TE modes with $\mathcal{E}_z^{(1)} = 0$
and TM modes with $\mathcal{H}_z^{(1)} = 0$. We have for the TE modes in region 1

$$\mathcal{E}_{\nu z}^{(1)} = 0 \qquad (3)$$

and

$$\mathcal{H}_{\nu z}^{(1)} = A_\nu \cos(k_{x\nu}x)\,\sin(k_{y\nu}y)\,e^{-i\beta z}. \qquad (4)$$

We use odd integers, $\nu$ = 1, 3, ···, to label these modes. In addition
to the sine and cosine functions appearing in (4) we could also use the
other three possible combinations. We restrict ourselves to the modes
shown here, thus limiting ourselves to the study of fiber modes of a
certain symmetry. All other modes can be obtained similarly.

The other field components can be obtained from $\mathcal{E}_z^{(1)}$ and $\mathcal{H}_z^{(1)}$ by
differentiation (see, for example, page 13 of Ref. 4 or page 51 of Ref. 5).
The parameters appearing in (4) are related as

$$n^2k^2 = k_{x\nu}^2 + k_{y\nu}^2 + \beta^2, \qquad (5)$$

with

$$k = \omega\sqrt{\epsilon_o\mu_o} = 2\pi/\lambda. \qquad (6)$$

The refractive index of the single-material fiber is designated by n and
the refractive index of the medium outside the fiber is taken to be unity.
The angular frequency is $\omega$, and $\epsilon_o$ and $\mu_o$ are the dielectric permittivity
and the magnetic permeability of vacuum.

The TM modes are labeled by even integers, $\nu$ = 2, 4, ···, and are
obtained from the field components

$$\mathcal{E}_{\nu z}^{(1)} = B_\nu \sin(k_{x\nu}x)\,\cos(k_{y\nu}y)\,e^{-i\beta z} \qquad (7)$$

and

$$\mathcal{H}_{\nu z}^{(1)} = 0. \qquad (8)$$

TE and TM modes must satisfy the boundary conditions that $\mathcal{E}_x$ and
$\mathcal{E}_z$ vanish at y = ±d. These conditions are met if we use

$$k_{y\nu} = (2\mu_\nu - 1)\,\frac{\pi}{2d}. \qquad (9)$$

Equations (5) and (9) are the same for TE modes (odd values of $\nu$)
and TM modes (even values of $\nu$). The integers $\mu_\nu$ assume the values

$$\mu_\nu = 1 \text{ for } \nu = 1, 2$$
$$\mu_\nu = 2 \text{ for } \nu = 3, 4 \qquad (10)$$
$$\mu_\nu = 3 \text{ for } \nu = 5, 6$$
etc.
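The labeling convention of (9) and (10) can be summarized in a small helper. This is only a sketch of the bookkeeping: the function name `core_mode_label` and the example plate half-spacing are assumptions made for illustration.

```python
import math


def core_mode_label(nu, d):
    """Return (polarization, mu, k_y) for the core mode with index nu = 1, 2, 3, ...

    Odd nu labels TE modes, even nu labels TM modes; mu follows (10) and k_y follows (9).
    """
    if nu < 1:
        raise ValueError("mode index nu must be a positive integer")
    polarization = "TE" if nu % 2 == 1 else "TM"
    mu = (nu + 1) // 2                        # (10): nu = 1, 2 -> 1; nu = 3, 4 -> 2; ...
    k_y = (2 * mu - 1) * math.pi / (2.0 * d)  # (9)
    return polarization, mu, k_y


D_HALF = 5.0   # assumed half-spacing of the conducting planes, arbitrary length unit
for nu in range(1, 7):
    print(nu, core_mode_label(nu, D_HALF))
```

Because $k_{y\nu}d$ is an odd multiple of $\pi/2$, $\cos(k_{y\nu}d)$ vanishes, which is how the boundary condition at y = ±d is met.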


The TE and TM modes are mutually orthogonal. Their amplitude coefficients
can be related to the amount of power in the core region by
means of the equation

$$P = \tfrac{1}{2}\int_{-b}^{b} dx \int_{-d}^{d} dy\,(\boldsymbol{\mathcal{E}}_\nu \times \boldsymbol{\mathcal{H}}_\nu^*)_z. \qquad (11)$$

The asterisk indicates complex conjugation, and the subscript z labels
the z component of the vector. From (3), (4), and (11) we obtain


2wpo(kin + kzy)?P , 


aa| (k2, + k2,)b + (k2, — #3.) sen ; (12) 


A, = 
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From (7), (8), and (11) follows also 
2w eo (ky, + Ra el ag 
ae n2pd| (kz, + ki,)b — (k2, — k2,) 


toe 


sin 2k,,b| ¢ - (13) 
Zhe 


It is apparent from (5) and (9) that $k_{x\nu}^2$ can assume positive as well as
negative values.

We choose P equal to the unit of power. With this normalization,
$|c_\nu^{(i)}|^2$ appearing in (1) and (2) measure directly the power carried by
each mode.

Next, we turn to the modes of the narrower support slabs. Since the 
perfectly conducting planes do not touch the support slabs, their 
modes are more complicated. Strictly speaking, we do not have TE or 
TM modes with reference to the z direction. However, if we refer the 
labels TE or TM to the direction of propagation of the modes in the 
x-z plane, we do indeed have transverse electric and transverse 
magnetic modes. Used in this sense, we obtain the following z com- 
ponents of the TE modes of the support slabs: 


Cye7taz (1 el—9) cos (oyry)em *F ly| <i 
2 COS Fy rt iPass 
oleic (Cr sa 
xX sin [o.(|y| — d)Je** tS ly| Sd (14) 

and 

2B C1 g-ive(l2I-) gin (oyyy)eniP* ly| St 

UWazyWLo 
HOD 2 8 y Sin yt 








- ‘ em tezr (1z|—d) 
10 zyWpLo ly| cos po(d —t 


x cos[p(|y| — d)Je** t5 ly| Sd. (15) 


For TE modes, we have $\mathcal{E}_y^{(2)} = 0$.
Maxwell’s equations are satisfied if the parameters appearing in
these field expressions satisfy the following relations:

$$n^2k^2 = \sigma_{x\nu}^2 + \sigma_{y\nu}^2 + \beta^2 \qquad (16)$$

and

$$k^2 = \sigma_{x\nu}^2 + \rho_\nu^2 + \beta^2. \qquad (17)$$

We again use odd integers $\nu$ to label the TE modes.

The field expressions satisfy the condition of vanishing tangential
electric fields at the perfectly conducting planes. To satisfy the boundary
conditions at the surface of the slab, the x dependence of the field
expressions must be identical inside as well as outside the slab. For
this reason, the parameter $\sigma_{x\nu}$ is common to the fields for $t \geq |y|$ and
$t \leq |y| \leq d$. Since the fields must also satisfy boundary conditions
along the planes x = ±b for all values of z, the parameter $\beta$ must be
the same for all field expressions, where $\beta$ is the propagation constant
for the mode of the single-material fiber that is yet to be determined.

The requirement of continuity of the field components $\mathcal{E}_x$, $\mathcal{E}_z$,
$\mathcal{H}_x$, and $\mathcal{H}_z$ at y = ±t leads to the eigenvalue equation

$$\tan\sigma_{y\nu}t = \frac{\rho_\nu}{\sigma_{y\nu}}\,\cot[\rho_\nu(d - t)]. \qquad (18)$$

Equation (18) determines the allowed values of $\sigma_{y\nu}$ and $\rho_\nu$, since
according to (16) and (17) we have

$$\rho_\nu^2 = \sigma_{y\nu}^2 - (n^2 - 1)k^2. \qquad (19)$$

Although $\sigma_{y\nu}^2$ is always positive, $\rho_\nu^2$ can be positive as well as negative.
Modes with negative values of $\rho_\nu^2$ correspond to the guided modes of an
open slab. For negative values of $\rho_\nu^2$, the cotangent function on the
right-hand side of (18) becomes a hyperbolic cotangent function that
approaches unity for large values of its argument. The eigenvalue
equation (18) is thus identical [for large values of $|\rho_\nu|(d - t)$] to the
eigenvalue equation (8.3-16) on page 308 of Ref. 4 for even TE modes
of the slab waveguide.
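For a guided slab mode, $\rho_\nu^2$ is negative, and writing $\rho_\nu = i\gamma_\nu$ turns the right-hand side of (18) into $(\gamma_\nu/\sigma_{y\nu})\coth[\gamma_\nu(d - t)]$. The sketch below finds the lowest root of that real form by bisection, with $\gamma_\nu$ taken from (19). The symbol $\gamma_\nu$, the function name, and all parameter values (a silica-like index and lengths in units of the wavelength) are assumptions introduced for illustration, not the numerical procedure of the paper.

```python
import math


def lowest_guided_te_root(n, k, t, d, iterations=200):
    """Lowest sigma_y solving (18) with rho = i*gamma, gamma^2 = (n^2 - 1) k^2 - sigma_y^2.

    Returns (sigma_y, gamma). Raises ValueError if no sign change is bracketed.
    """
    v_sq = (n * n - 1.0) * k * k
    upper = min(math.pi / (2.0 * t), math.sqrt(v_sq))

    def mismatch(sigma):
        gamma = math.sqrt(v_sq - sigma * sigma)
        # (18) multiplied by sigma: sigma tan(sigma t) - gamma coth[gamma (d - t)]
        return sigma * math.tan(sigma * t) - gamma / math.tanh(gamma * (d - t))

    lo = 1.0e-9 * upper
    hi = upper * (1.0 - 1.0e-12)
    if mismatch(lo) >= 0.0 or mismatch(hi) <= 0.0:
        raise ValueError("no guided root bracketed for these parameters")
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mismatch(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    sigma_y = 0.5 * (lo + hi)
    return sigma_y, math.sqrt(v_sq - sigma_y * sigma_y)


# Illustrative (assumed) values: wavelength 1, slab half-thickness 1, plate half-spacing 5.
N_INDEX, K_FREE, T_HALF, D_HALF = 1.46, 2.0 * math.pi, 1.0, 5.0
sigma_y, gamma = lowest_guided_te_root(N_INDEX, K_FREE, T_HALF, D_HALF)
print(sigma_y, gamma)
```

With these values $\gamma_\nu(d - t)$ is large, so the hyperbolic cotangent is indistinguishable from unity and the root coincides with that of the open slab, as the text states.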

Modes with positive values of $\rho_\nu^2$ correspond to the radiation modes
of the open slab. However, instead of the continuous spectrum of
radiation modes,⁴,⁵ we now have a discrete spectrum of modes that
approach the modes of the metallic parallel plate waveguide in the
limit of vanishing slab thickness 2t. The guided as well as the radiation
modes of the narrow support slabs are thus represented by the same
analytical expressions (14) and (15). The presence of the perfectly
conducting planes has the added advantage of causing the mode spectrum
to be discrete.

The parameter $\sigma_{x\nu}^2$ can also be positive as well as negative. Positive
values of $\sigma_{x\nu}^2$ correspond to real values of $\sigma_{x\nu}$, so that the mode fields
(14) and (15) represent traveling waves that carry power away from
the core region into the slab. Coupling the modes in the core region and
the slab regions thus results in a leaky wave. It is clear that we obtain
guided single-material fiber modes only for negative values of $\sigma_{x\nu}^2$. We
see from (16) that all $\sigma_{x\nu}^2$ are negative if $\sigma_{x1}^2$ of the lowest-order mode is
negative, because the values of $\sigma_{y\nu}$ increase with increasing mode
number. It is thus immediately apparent that lossless guided modes of
the single-material fiber are indeed possible. Neither the guided modes
of the slab (those tightly confined to the slab region) nor the “unguided”
modes, which correspond to the radiation modes of the open
slab, carry away power; all decay exponentially in the x direction. This
argument is not changed when we let the metallic plates move to
infinity so that we obtain a truly free single-material fiber. The
single-material fiber is thus seen to be able to support guided modes whose
fields are confined to the vicinity of the fiber core. The existence of these
guided modes is contingent on sufficiently large values of $\beta$. Whether
such solutions with large $\beta$ values really exist depends on the solutions
that we must yet derive of the eigenvalue equation for $\beta$. However,
even at this stage we can state that guided modes that do not suffer
radiation losses are possible at least in principle.

Using (11), (14), and (15) we can again relate the amplitude co- 
efficient to the power unit P: 


C, = 1B + 3) |b + {cos® oy,t/sin® [pd —D]}@—H +- (20) 
— (n? — 1)k? sin (20,,t)/20,,p>| 


The TM modes of the support slabs are labeled by even integers v 
and follow from their z components: 


D,e-*72"(l2l-®) cos (cy yy)e7* ly| S¢ 


av 


oreo e—tezy (| z]—b) 
sin [p,(d — t)] 





X sin [p(|y| — d)Je** tS ly| Sd, (21) 
to 2yn?k? ‘ ; 
peel Ae etlee —tozy(|z|—d) vy )e*Bz <i 
Seen Dye sin (oy,y)eé ly| Ss 
YO = + Io2,n*k? SIN Gyst Ye gasosptlsl=3) 


wpoBoy» ” cos [p,(d — t)] |y] 
X cos[p(ly| — d)je** tS |y| Sd. (22) 


For TM modes we have $\mathcal{H}_y^{(2)} = 0$.
The parameters $\sigma_{x\nu}$, $\sigma_{y\nu}$, and $\rho_\nu$ are again related by (16), (17), and
(19). The eigenvalue equation for TM modes is

$$\tan\sigma_{y\nu}t = \frac{1}{n^2}\,\frac{\sigma_{y\nu}}{\rho_\nu}\,\cot[\rho_\nu(d - t)]. \qquad (23)$$

For large imaginary values of $\rho_\nu(d - t)$, (23) becomes the eigenvalue
equation (8.3-45), page 313 of Ref. 4, for odd TM modes of the free
slab.
Finally, we have


Awp Boi | Ory | , 
aie 2 5 sin? oy yt 7 
D, = nk (8 = ozv) t+ n cos? [p,(d = t) | (d t) ‘ (24) 
is (n? — 1)k*? sin 20,,t 
X 20 yp 


Now we have written down the field expressions for the mode fields
that must be substituted into the series expansions (1) and (2) for the
mode of the single-material fiber. It remains to match the field in the
core of the single-material fiber to the field in the regions of the support
slab. We need to require continuity of $E_y$, $E_z$, $H_y$, and $H_z$ only along
the line x = b and 0 < y < d, since the boundary conditions in the
remaining three quadrants are satisfied for reasons of symmetry. Since
the numerical analysis can handle only a finite number of equations,
we require continuity of the tangential field components only at a
finite number of points. Adjusting the size of the series expansion to
the number of matching points, we obtain a finite, homogeneous equation
system for the determination of the expansion coefficients $c_\nu^{(i)}$.
This equation system can only have a solution if the determinant
vanishes. The condition of vanishing system determinant provides
the eigenvalue equation for the propagation constant $\beta$ of the
single-material fiber.


III. SPECIAL CASES AND APPROXIMATE SOLUTIONS


In the limit t = 0, an exact solution of the guided-mode problem is
easily obtained. Since, in this case, the distributions of the fields in the
two regions have the same y dependence, the boundary conditions
along the plane x = ±b can be satisfied without resorting to the
series expansions (1) and (2). Using the field expressions (3), (4), (7),
(8), (14), (15), (21), and (22) and requiring continuity of $E_y$, $E_z$, $H_y$,
and $H_z$ at x = b leads to the eigenvalue equations

$$\tan k_x b = -\frac{k_x}{i\sigma_x} \qquad (25)$$

or

$$\tan k_x b = n^2\,\frac{i\sigma_x}{k_x}. \qquad (26)$$

For guided modes we have

$$i\sigma_x = \gamma, \qquad (27)$$

with real positive $\gamma$.
Equation (25) is the eigenvalue equation for odd TE modes of a
slab waveguide, while (26) is the eigenvalue equation for even TM
modes of the slab.⁴ When the amplitude coefficients of the superpositions
of the TE and TM modes (3), (4), (7), and (8) (that are determined
with the help of the boundary conditions) are substituted into
the field expressions, we obtain for the field in the fiber core belonging
to (25)


E,= —- # F sin (kzx) cos (kyy)e~*# lx| <b (28) 
and 
#4 So desk) anh lz] <b. (29) 
0 


For this mode we have $E_x = 0$. Viewing this field from the boundary
of the slab, x = b, we see that the normal electric field component
vanishes. This is typical for TE modes of the slab waveguide so that it
is not surprising that the propagation constant of this mode is determined
by an eigenvalue equation of the TE type.
The mode belonging to (26) has the following z components: 
E. k.8 


he G sin (kz) cos (kyy)e7*# lz| <b (30) 
y 


Il 


and 


Hi, 


SO sos tea yen aes jz] <b. (BL) 
who 


This mode has $H_x = 0$. With respect to the surface x = b, it is indeed
a TM mode.

For simplicity, the fields outside the core are not stated. However,
a good approximation to these field expressions is obtained by using
(21) and (22) to extend the field (28) and (29) outside the core and
similarly by using (14) and (15) with the core fields (30) and (31).

For t ≠ 0, the mode field of the single-material fiber can only be
described by an infinite series of modes. However, we find a crude
approximation by using only the first two terms in this series expansion
and obtain an eigenvalue equation by requiring that a certain wave
impedance be matched at the interface x = b. We stated earlier that
only those modes of our structure with vanishing normal field components
at the metallic planes resemble modes of the true single-material
fiber. The mode field (28) and (29) with $E_x = 0$ has a strong normal
component of the electric field. We thus limit ourselves to the horizontally
polarized field and use (30) and (31) as a crude approximation.
Note that the field (30) and (31) consists of a superposition of
one TE mode [eqs. (3) and (4)] and one TM mode [eqs. (7) and (8)].

The wave impedance, 
EE topoks 


H, nk? 





tan kb, (82) 


obtained from (30) and (31) does not depend on the y coordinate. 
Similarly, we use (14) and (15) to form 


E, WL oo x 

H, SS nek? — oe (33) 
which is also independent of y. Since the tangential field components 
must be continuous at the boundary x = b between the two regions, 
we require that (32) be equal to (33), obtaining the following approxi- 
mate eigenvalue equation for the horizontally polarized modes (of only 
a certain special symmetry) of the single-material fiber : 


to, nk? 


tan kb = ke ee — o2 


(34) 
The parameter $\sigma_\nu$ must be obtained as the solution of the
eigenvalue equation (18). In the limit t = 0, (34) should reduce to
(26). To see that the correct limit is obtained, we use (19) to write

$$n^2k^2 - \sigma_\nu^2 = k^2 - \rho_\nu^2. \tag{35}$$

For t = 0, we obtain from (18)

$$\rho_\nu = (2\nu - 1)\frac{\pi}{2d}. \tag{36}$$

For small values of the integer ν and kd ≫ 1, we have $\rho_\nu \ll k$ so
that (26) and (34) become indeed approximately the same. We do not get
exact agreement, since we approximated the field outside the core by
(14) and (15) instead of using the exact field expressions. We see that
our eigenvalue equation (34) is a good approximation in the two
limiting cases, t → 0 and t → d. Once $\sigma_\nu$ has been determined
from (18) we


find » = tc, from 
n = Voy — kz — ki, (37) 


and (34) with the help of (9). The propagation constant β can then be
obtained from (5) or (16).


IV. DISCUSSION AND NUMERICAL EXAMPLES


Marcatili has shown by an approximate analysis¹ that the
single-material fiber can be made to support only a single guided mode,
even if its dimensions are large compared to the wavelength, if the
ratio (1/4)(bd/t²) approaches unity. However, for large values of kd and
large values of bd/t², the single-material fiber supports a large number
of guided modes.

We are limiting our discussion to the lowest-order guided mode.
Since the properties of the single-material fiber can be obtained
adequately from the approximate solutions, it is our principal purpose
to show how well the approximate solution (34) and Marcatili’s
approximate theory work, and to study the field distributions of the
exact solution that cannot be obtained from the approximate analysis. As
indicated earlier, we limit the discussion to the modes with horizontal
polarization ($H_x = 0$), since the vertically polarized modes
($E_x = 0$) are very strongly influenced by the presence of the
perfectly conducting planes that were used only to simplify the
analysis.* The analysis is further restricted to modes whose $E_x$
component is a symmetric function in both x and y. The modes with other
symmetries can be obtained similarly by using slab waveguide modes of
the appropriate symmetries.

All the numerical examples shown here were computed for the 
following choice of parameters: 


$$d/\lambda = b/\lambda = 5, \qquad n = 1.5. \tag{38}$$


The boundary conditions at the plane x = b were satisfied by matching
the fields at 10 points evenly distributed between y/d = 0.05 and 
y/d = 0.95. As a consequence, the field expansion uses 20 modes in 
each region, 10 TE modes and 10 TM modes. Adequate accuracy was 
obtained this way. However, an expansion using only 6 points to 
match the fields did not appear sufficiently accurate. 
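
As a small illustration of the matching grid just described, the sketch below generates the 10 evenly spaced values of y/d and records the mode count per region quoted in the text. The function and variable names are hypothetical and are not taken from the original program.

```python
def matching_points(n_points=10, y_min=0.05, y_max=0.95):
    """Return n_points values of y/d evenly spaced from y_min to y_max."""
    step = (y_max - y_min) / (n_points - 1)
    return [round(y_min + i * step, 10) for i in range(n_points)]


points = matching_points()

# The text states that 10 matching points go with 10 TE and 10 TM modes
# in each region, that is, 20 modes per region.
modes_per_region = {"TE": len(points), "TM": len(points)}
```

Calling `matching_points(6)` would give the coarser 6-point grid that the text reports as not sufficiently accurate.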

* The case of vertically polarized modes can be treated by replacing the
electrical short-circuit planes with magnetic short circuits.

The computer program was written to solve, first, the eigenvalue
equations (18) and (28) by an iterative search procedure. Next, the
computer was instructed to use a large trial value for β and compute
the normalized field amplitudes (12), (13), (20), and (24) as well as
the matrix elements of the equation system resulting from the boundary
conditions at the N matching points. Next, the system determinant
was examined and β was decreased until the determinant changed its
sign. By narrowing the increments for β successively and oscillating
around the point where the sign change of the determinant occurred,
an approximate solution for β was determined. Since the order of
magnitude of the determinant was not known a priori, no attempt was
made to reduce the value of the determinant below a certain limit.
Once an approximate eigenvalue had been found, the coefficient
$c_1^{(1)}$ was set equal to unity, and the first equation of the system
was omitted. The remaining equation system was solved by inverting the
reduced coefficient matrix. The values of the expansion coefficients
were finally used to calculate the magnitude and direction of the
electric field in a grid of preselected points in the x-y plane.
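
The search procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the original program: the 3×3 matrix `toy_system` is a hypothetical stand-in for the boundary-condition matrix, all function names are invented for the example, and the reduced system is solved by elimination rather than by explicit matrix inversion.

```python
import math


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gauss-Jordan elimination."""
    n = len(matrix)
    a = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(n):
            if r != col:
                factor = a[r][col] / a[col][col]
                for c in range(col, n + 1):
                    a[r][c] -= factor * a[col][c]
    return [a[i][n] / a[i][i] for i in range(n)]


def sign(value):
    return (value > 0) - (value < 0)


def find_sign_change(system, beta_start, step, shrink=0.1, rounds=8):
    """Decrease beta until det changes sign, then repeat with finer steps.

    Only the sign of the determinant is used; no tolerance on its size.
    """
    beta = beta_start
    reference = sign(determinant(system(beta)))
    bracket = step
    for _ in range(rounds):
        while sign(determinant(system(beta - step))) == reference:
            beta -= step
            if beta < 0.0:
                raise ValueError("no sign change found above beta = 0")
        bracket = step   # the root now lies within [beta - step, beta]
        step *= shrink   # narrow the increment for the next round
    return beta - 0.5 * bracket


def coefficients_with_first_fixed(matrix):
    """Set the first unknown to 1, omit the first equation, solve the rest."""
    reduced = [row[1:] for row in matrix[1:]]
    rhs = [-row[0] for row in matrix[1:]]
    return [1.0] + solve(reduced, rhs)


def toy_system(beta):
    # Hypothetical matrix whose determinant vanishes at beta = 2 + sqrt(2).
    x = beta - 2.0
    return [[x, 1.0, 0.0], [1.0, x, 1.0], [0.0, 1.0, x]]


beta_found = find_sign_change(toy_system, beta_start=5.0, step=0.37)
coefficients = coefficients_with_first_fixed(toy_system(beta_found))
```

Starting from a large trial value and stepping downward finds the largest root first, which for the toy matrix is 2 + √2.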

Figures 3 to 6 compare the magnitude of the electric field vector of
the lowest-order mode of the single-material fiber with the magnitude
of the field of the rectangular waveguide (t/d = 0). Figure 3 applies to
a single-material fiber with the dimensions given by (38) and with
t/d = 0.32. The magnitude of the field intensity is plotted as a
function of x/b for different values of y/d. It is apparent that the
field intensity decreases with increasing values of y. The field is
strongest on axis and vanishes at y = d. In the absence of metallic
planes, the field would not be zero at y = d, but would decrease to a
very small value. The solid curves indicate the field of the
single-material fiber, while the broken curves apply to the rectangular
waveguide (t = 0). In the region of the guide where the support slab is
present, y/d < 0.32, the field of the single-material fiber reaches out
much further than the field of the corresponding rectangular waveguide,
since it penetrates into the slab. For y/d > 0.32, the field shape of
the single-material fiber has become identical with the field
distribution of the rectangular waveguide.

Figure 4 shows the field distribution as a function of y/d for four
different values of x/d. The solid curves describe again the field of
the single-material fiber, while the broken curves belong to the
rectangular waveguide. In the y direction, both fields vanish at y = d
but, near the edge of the single-material fiber, its field intensity is
quite different from the rectangular waveguide field. We have plotted
the ratio of the field intensity to the maximum value (the value that
the field assumes for each value of x/d) at y/d = 0. Far from the edge
of the core, the single-material fiber field is identical to the field
of the rectangular waveguide. However, near the edge, at x/b = 1, the
field is strong in the region 0 < y/d < 0.32, since it is allowed to
penetrate into the support slab. But in the range 0.32 < y/d < 1, where
it encounters the dielectric interface, it is relatively much weaker.
The field of the rectangular waveguide is likewise weak near the
dielectric interface; it appears strong only because of our
normalization with respect to the maximum field intensity at y/d = 0.
For x/b > 1, the rectangular waveguide field is no longer plotted since
it decays rapidly to insignificant values outside the waveguide core.
The single-material fiber field shows the distribution typical of the
lowest-order mode in the support slab.

[Figure 3: horizontal polarization; curves for y/d = 0.05, 0.55, and
0.95 plotted against x/b; t = 0: βλ = 9.40965; t/d = 0.32: βλ = 9.41043.]

Fig. 3—Magnitude of the electric field vector shown as a function of the
normalized horizontal dimension x/b. Solid curves describe the
single-material fiber with t/d = 0.32, and broken curves apply to the
fiber with t/d = 0.

[Figure 4: t/d = 0.32; field magnitude plotted against y/d.]

Fig. 4—Magnitude of the electric field vector relative to its maximum
value at y = 0 as a function of the normalized vertical dimension y/d.
Solid and broken curves describe the single-material fiber with
t/d = 0.32 and t/d = 0.

Figures 5 and 6 show the same behavior for a single-material fiber
with a much wider slab, t/d = 0.8. The field penetrates even further
into the support slab, as can be seen from Fig. 5. However, the field
distribution in the vertical plane, shown in Fig. 6, is now much closer
to the field distribution in the core of the rectangular fiber.

[Figure 5: curve label y/d = 0.85; field magnitude plotted against x/b.]

Fig. 5—Magnitude of the electric field vector shown as a function of the
normalized horizontal dimension x/b. Solid curves describe the
single-material fiber with t/d = 0.8, and the broken curves apply to the
fiber with t/d = 0.

Fig. 6—Magnitude of the electric field vector relative to its maximum
value at y = 0 as a function of the normalized vertical dimension y/d.
Solid and broken curves describe the single-material fiber with
t/d = 0.8 and t/d = 0.

Figures 7 and 8 show the mode spectra for the single-material fiber
with t/d = 0.32. Figure 7 presents the mode content of the field in the
core. Because of our normalization, the square of the mode amplitudes
$c_\nu^{(1)}$ represents the relative power carried by each mode of the
series expansion (1) and (2). The broken vertical lines give the mode
content of the corresponding mode of the rectangular waveguide. It is
remarkable how nearly identical the mode amplitudes of the lowest-order
TE and TM modes are in either case. Note that the mode amplitudes of
the higher-order modes vanish because of the presence of the perfectly
conducting planes; without them, the rectangular waveguide modes
would also have to be represented by infinite-series expansions with a
very slight mixture of higher-order modes. The mode of the
single-material fiber consists of a mixture of the higher-order modes
required to produce the field distortions in Figs. 4 and 6. The rise of
the mode amplitudes for modes with ν > 12 is not truly representative
of the actual mode content. If the mode number is varied in the
numerical approximation, it is found that, near the last mode, ν = N,
the mode amplitudes always tend to assume increased values. The
appearance of the mode spectrum is thus somewhat dependent on the total
number N of modes used in the series expansion. However, the
distribution of lower-order modes was found to be very similar for
N = 16 compared to the spectrum shown in Fig. 7 for N = 20. Only the
highest-order modes appear with different amplitudes. When N = 12 was
used, a different mode spectrum and an implausible field distribution
were obtained, indicating insufficient accuracy.

Fig. 7—Mode spectrum of the lowest-order single-material fiber mode
inside its core. The solid vertical lines describe a fiber with
t/d = 0.32, the broken vertical lines belong to the case t/d = 0.

[Figure 8: horizontal polarization; t = 0 and t/d = 0.32; mode
amplitudes plotted against mode number ν, with the guided-mode and
radiation-mode ranges marked.]

Fig. 8—Mode spectrum of the lowest-order single-material fiber mode in
the region of its support slab with t/d = 0.32. The short vertical
broken line represents the mode content of the fiber with t = 0.

Figure 8 shows the mode content of the field in the support slab. The
mode amplitudes are much smaller, since much less power is carried
outside the fiber core. The lowest-order TE mode is most prominent.
The short broken line at ν = 1 represents the much weaker contribution
of the rectangular waveguide (t = 0). For our model, the lowest-order
TM mode outside the core, ν = 2, contributes slightly to the
mode field of the rectangular dielectric waveguide, but its amplitude
is too small to be visible on the scale of this figure. It is interesting
that the field of the single-material fiber in the region of the support
slab is represented to a very good approximation by the lowest-order
TE mode of the support slab. The modes ν ≤ 8 are guided slab modes
with imaginary values of $\rho_\nu$; modes with ν > 8 have real-valued
parameters $\rho_\nu$ corresponding to the radiation modes of open slabs.

Figure 9 is a plot of the direction of the electric field vector in the 
vicinity of the corner of the dielectric material at x/b = 1, y/d = 0.32.
Far from this corner, the field is horizontally polarized. It is remarkable 
how little distortion is evident near the dielectric discontinuity. There 
is no peak in the field intensity at the sharp dielectric corner, and the 
field direction is likewise almost unperturbed. 

Finally, we present solutions of the eigenvalue equation (system
determinant = 0). Instead of plotting values for the propagation
constant β, we present values for the relative effective width of the
fiber core. If the core boundary at x = b were a metal wall we would have

$$k_x = \frac{\pi}{2b} \tag{39}$$

for the lowest-order mode. The actual values of $k_x$ deviate from the
value (39) partly because the dielectric discontinuity at x = b is not
an electrical short circuit, and also because the field penetrates some
distance into the support slab. We use the actual value of $k_x$ to
define an effective core width

$$b' = \frac{\pi}{2k_x}. \tag{40}$$

The value of $k_x$ is obtained from the solution β of the eigenvalue
equation with the help of (5) and (9)

$$k_x = \left[ n^2k^2 - \beta^2 - \left(\frac{\pi}{2d}\right)^2 \right]^{1/2}. \tag{41}$$

Fig. 9—Short arrows indicate the direction of the electric field vector
of the lowest-order mode of the single-material fiber, with t/d = 0.32,
near the corner of the dielectric material where the support slab is
attached to the core.

The solid line in Fig. 10 represents the relative effective width of the
core for the lowest-order mode of the single-material fiber with the
dimensions stated in (38). This mode does not suffer a cutoff. It can
propagate without power outflow into the support slab for arbitrarily
small values of 1 − (t/d). The computer program had difficulties
solving the eigenvalue problem for t/d > 0.8; thus, the solid curve is
not continued beyond this point. The nonzero value of (b’ − b)/b at
t/d = 0 represents the field penetration of the rectangular waveguide
mode outside the dielectric core.

Fig. 10—Relative effective width of the core of the single-material
fiber as a function of the relative thickness of its support slab t/d
for the lowest-order mode. The solid line is the result of the numerical
solution of the complete theory; the broken line was calculated from the
solution of the approximate eigenvalue equation (34); and the dash-dot
line is the result of Marcatili’s theory.

The upper dotted line of Fig. 10 is the result of solving the
approximate eigenvalue equation (34). For t/d near zero and near unity,
the approximation is excellent. The departure of the approximation in
the middle of the range is not surprising when we look at Fig. 4. 
The approximate solution uses the field distribution represented by the 
dotted line of Fig. 4, which is clearly a poor approximation of the 
actual field distribution. In fact, it is surprising how good the approxi- 
mate solution for (b’ — b)/b is, even in this case. Even though the 
solid curve does not extend past t/d = 0.8, we can trust the dotted 
curve in this region, since the actual field distribution becomes very 
close to the approximate distribution. This is evident from a compari- 
son of the solid and broken curves of Fig. 6. 

The large error in the approximation in Fig. 10 causes only a very
slight error for β. For t/d = 0.32, we obtain from the solid curve of
Fig. 10 (b’ − b)/b = 0.1, corresponding to $k_x\lambda = 0.2856$ or
βλ = 9.41043. From the broken curve we obtain (b’ − b)/b = 0.24,
$k_x\lambda = 0.25334$ or βλ = 9.41135. The relative error in the β
value is thus only Δβ/β = 0.01 percent.
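
The arithmetic in this paragraph can be checked with a short sketch. It assumes that the effective width is b’ = π/(2k_x) and that k_x² = n²k² − β² − (π/2d)², with b/λ = d/λ = 5 from (38). It also assumes that the quoted βλ values were computed with nkλ rounded to 9.42, which is an inference from the numbers and not something the text states; with nkλ = 1.5·2π ≈ 9.4248 the same formulas give βλ close to 9.415. All names are hypothetical.

```python
import math

B_OVER_LAMBDA = 5.0   # b / lambda, from (38)
D_OVER_LAMBDA = 5.0   # d / lambda, from (38)
NK_LAMBDA = 9.42      # assumption: n*k*lambda rounded (1.5 * 2*pi = 9.4248)


def kx_lambda(rel_width):
    """k_x * lambda for a relative effective width (b' - b)/b."""
    b_eff = B_OVER_LAMBDA * (1.0 + rel_width)   # b' / lambda
    return math.pi / (2.0 * b_eff)


def beta_lambda(rel_width):
    """beta * lambda from k_x and k_y = pi/(2d)."""
    ky = math.pi / (2.0 * D_OVER_LAMBDA)
    return math.sqrt(NK_LAMBDA ** 2 - kx_lambda(rel_width) ** 2 - ky ** 2)


kx_exact = kx_lambda(0.10)      # solid curve of Fig. 10
kx_approx = kx_lambda(0.24)     # broken curve of Fig. 10
beta_exact = beta_lambda(0.10)
beta_approx = beta_lambda(0.24)
rel_error_percent = 100.0 * (beta_approx - beta_exact) / beta_exact
```

Under these assumptions the sketch gives k_xλ of about 0.2856 and 0.2533, βλ of about 9.41043 and 9.41135, and a relative difference in β of about 0.01 percent.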

The dash-dot curve shown in Fig. 10 is a plot of eq. (15) of Ref. 1.
This curve was plotted by using the following identification of the
symbols in Ref. 1 with our symbols: T = 2t, W = 2b, and H = 2d.
The dash-dot curve of Fig. 10 shows clearly how remarkably accurately
Marcatili’s approximate theory describes the effective width and hence
the propagation constant of the single-material fiber mode. His
approximation deviates more from the “exact” solution (given by the
solid curve) near t/d = 0 and t/d = 1 than does the dotted curve.
The disagreement near t/d = 0 is caused by assuming that the field
must vanish at the boundary x = b of the rectangular waveguide.


V. CONCLUSIONS 


We have studied the properties of the lowest-order mode of the 
single-material fiber using a model that departs from the actual fiber 
by the presence of two perfectly conducting planes shown in Fig. 2. 
Horizontally polarized modes are not appreciably distorted by the 
presence of these planes. In particular, we are confident that the influ- 
ence of the support slab on the field distribution and propagation con- 
stant of the single-material fiber mode is represented very accurately 
by this model. The agreement of the model with metallic planes and
the free single-material fiber becomes better for fibers with large values
of d/λ.

By representing the field of the single-material fiber as a superposi- 
tion of the modes of the dielectric slabs in the core region and in the 
region of the supports, we find solutions by matching the boundary 
conditions in the plane x = b at a finite number of points. We find that
matching along 10 points in the range 0 < y/d < 1 (requiring 20
modes in each region of the guide) provides satisfactory accuracy. 

This study shows that the field of the single-material fiber in the
vicinity of the edge, at x = b, departs considerably from the field
distribution that would result for t = 0. However, for very narrow as
well as very wide support slabs, a simple approximation using only the
two lowest-order modes of the series expansion yields satisfactory
results. Our theory thus serves the purpose of clarifying the range of
applicability of approximate descriptions¹ of the single-material fiber
and of inspiring confidence in the validity of such approximations.

In particular, it is our aim to show that Marcatili’s approximate
theory of the single-material fiber is indeed justified and yields very
good results compared to our more precise treatment.


APPENDIX


It is claimed that the analysis presented in this paper is an almost 
exact description of the single-material fiber, and yet the structure that 
is analyzed differs from the actual single-material fiber by the presence 
of the perfect metallic conductors attached to the fiber core, as shown 
in Fig. 2. In defense of this procedure, two remarks may be made here. 

The performance of the single-material fiber is dominated not by the 
dielectric-air interface on the two sides at y = ±d of the fiber core
but by the presence of the attached support slabs. The electromagnetic 
fields of the single-material fiber modes extend much further into the 
support slabs than they do into the air space outside the core, as shown 
in Figs. 3 through 6. The dielectric-air boundary acts almost like an 
electrical short circuit, so that the presence of actual short circuits 
at the dielectric-air interface at y = ±d has a very slight effect. In
particular, it is the radiation of power into the support slabs rather 
than into the air space outside the core that signals the cutoff of the 
guided modes. This behavior is described correctly by our analysis.

It is easy to estimate the field penetration into the air space above
and below the core in the absence of the perfect conductors. The field
outside the fiber is described by the functional dependence exp(−γy).
The decay parameter γ is defined as [see eqs. (1.2-14) and (1.3-44),
Ref. 5]:

$$\gamma = \left[ (n^2 - 1)k^2 - \left(\frac{\pi}{2d}\right)^2 \right]^{1/2}. \tag{42}$$

With the numbers used in the numerical example, we obtain γλ = 7.
This means that, at a distance of λ/7 from the air-dielectric interface,
the field has decayed to 1/e² (or 14 percent) of its power density at the
interface. Instead of having an effective electric short circuit at this
distance, the presence of the metallic planes moves the short circuit
a relative distance of 1.4 percent (in terms of the fiber diameter)
nearer to the fiber core. This small change of the electrical width of the
core has only a very slight effect on the field penetration into the
slabs, which is the most interesting feature of the single-material fiber.
Furthermore, this change can be taken into account by allowing the
value 2d of the modified fiber to be 3 percent larger than that of the
actual fiber.
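
The estimate above can be reproduced numerically. The sketch below assumes the usual slab form γ² = (n² − 1)k² − (π/2d)² for the decay parameter, together with n = 1.5 and d/λ = 5 from (38); the variable names are hypothetical.

```python
import math

n = 1.5
d_over_lambda = 5.0
k_lambda = 2.0 * math.pi    # free-space k times lambda

# Decay parameter times wavelength (assumed slab form of gamma).
gamma_lambda = math.sqrt(
    (n ** 2 - 1.0) * k_lambda ** 2
    - (math.pi / (2.0 * d_over_lambda)) ** 2
)

# Power density at a distance lambda/7, using the rounded gamma*lambda = 7.
power_ratio = math.exp(-2.0 * 7.0 * (1.0 / 7.0))

# Shift of the short circuit relative to the fiber diameter 2d.
shift_percent = 100.0 * (1.0 / 7.0) / (2.0 * d_over_lambda)
both_sides_percent = 2.0 * shift_percent
```

The result is γλ close to 7, a power ratio of about 0.135, a shift of about 1.4 percent on each side, and about 3 percent for both sides together.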

The cutoff condition of the modes follows from the eigenvalue equation,
which is the condition for the vanishing determinant of the equation
system resulting from the continuity requirements for the tangential
field components. At cutoff, the propagation constant β ceases to
have real solutions, but becomes complex. No analytical expression
can be given for the cutoff point. Its determination from the numerical
analysis is difficult. In this respect, the approximate theory proves to
be more powerful, since it is able to estimate the cutoff point.


Theory of the Single-Material, Helicoidal Fiber


By J. A. ARNAUD
(Manuscript received March 29, 1974)


The theory of propagation in a new single-material, single-mode, optical
fiber is given. The modes are of the whispering-gallery type, with the
propagation taking place along helicoidal paths close to the boundary of a
cylindrical dielectric rod. The beams are confined in the azimuthal
direction in helicoidal ridges. It is shown that single-mode, low-loss
operation is possible if the helix period is of the order of the rod
cross-section area divided by the wavelength and the ridge area is of the
order of 1 percent of the rod cross-section area for two channels. The rod
is supported by helicoidal wings that play a role in the mode-selection
mechanism.


I. INTRODUCTION 


The best-known single-mode optical fiber is the clad fiber. If the 
difference in refractive index between core and cladding is small, 
single-mode propagation can be achieved for core diameters that are 
large compared with the wavelength. It is, however, desirable to use 
just one material, such as quartz, that exhibits low impurity and
scattering losses. In a previous work,¹ we indicated that single-mode
propagation could be achieved in a single-material configuration that we
called a “helicoidal fiber.” Figure 1 represents a more recent version
of this type of fiber. 

To explain the mechanism of operation, let us consider first a
cylindrical dielectric rod with radius a = ρ. The refractive index of
the rod is perhaps n = 1.45 (quartz), and the surrounding medium is air.
Waves are guided along the rod boundary as shown in Fig. 2a. These
so-called “whispering-gallery modes” can be represented by rays
repeatedly reflected from the boundary because of total reflection. In
the interior of the rod, the modes are described by Bessel functions
$J_\nu(kr) \exp(i\nu\phi)$, where ν is a large integer and kr is a large
number of the order of ν. Because kr and ν are both large and
comparable to one another, the Bessel functions can be approximated by
Airy functions. The field is oscillatory from the rod boundary down to a
slightly smaller radius $r_c$, called the caustic (or turning point)
radius. For radii smaller than $r_c$, the field decays exponentially.
Thus, the field of whispering-gallery modes clings tightly to the rod
boundary. The distance between the caustic and the boundary, which
defines in some sense the “thickness” of the mode, is for the
fundamental mode of the order of $(\lambda^2 a)^{1/3}$, with λ the
wavelength in the medium and a the rod radius. For example, if
λ = 1 µm and a is equal to 8 mm, the fundamental mode thickness is of
the order of 20 µm. It increases with the mode number m, approximately
as $m^{2/3}$. As m increases, the phase velocity increases too.

Fig. 1—Open view of the single-material, helicoidal fiber for two
optical channels. The two optical beams propagate in the two ridges
shown, with areas t × 2d. The helicoidal motion is essential to maintain
confinement. High-order modes are not confined. They radiate away to the
envelope through the wings (only part of one is shown). The ratio
period/diameter is much larger than that shown in the figure.

These whispering-gallery modes can be generalized to take into
account a motion along the rod axis z. The combined rotation and axial
motion results in a helicoidal path that can be understood from simple
ray-optics considerations. The only significant difference from the
previous case is that the radius a of the rod should be replaced, in the
expression for the mode thickness, by the helix radius of curvature,
ρ = a/sin² θ, where θ denotes the angle that the helix makes with the
rod axis. For example, if a is 80 µm and θ = 0.1 radian, the mode
thickness is the same as in the previous example, where a was assumed
to be 8000 µm. If θ is equal to zero, there is of course no confinement
near the rod boundary. Observation of helicoidal rays in optics has
been reported.’

Fig. 2—(a) Whispering-gallery modes clinging to a circular boundary with
effective radius ρ (= a/sin² θ). The field (m = 0, 1, ···) is described
exactly by Bessel functions and approximately by Airy functions.
(b) Cross section in a local r, ξ plane.
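
The two numerical examples above can be checked with a short sketch. It assumes the order-of-magnitude thickness estimate (λ²a)^{1/3} and the helix radius of curvature ρ = a/sin² θ; the function names are hypothetical.

```python
import math


def mode_thickness(wavelength_um, radius_um):
    """Order-of-magnitude thickness (lambda^2 * radius)^(1/3), in um."""
    return (wavelength_um ** 2 * radius_um) ** (1.0 / 3.0)


def helix_radius_of_curvature(rod_radius_um, theta_rad):
    """rho = a / sin^2(theta) for a helix at angle theta to the rod axis."""
    return rod_radius_um / math.sin(theta_rad) ** 2


# Circular path on a rod of radius 8 mm.
thickness_circular = mode_thickness(1.0, 8000.0)

# Helical path on a rod of radius 80 um with theta = 0.1 radian.
rho = helix_radius_of_curvature(80.0, 0.1)
thickness_helix = mode_thickness(1.0, rho)
```

Both cases give a thickness of about 20 µm, since ρ comes out close to 8000 µm for the 80 µm rod.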

Let us now assume that we have selected a convenient value for θ,
perhaps θ = 2.5°, and that we wish to define one or more channels in
the azimuthal direction. This can be achieved with helicoidal
separators,¹ or ridges, as shown in Fig. 1, that follow the path of the
desired whispering-gallery modes. It is clear, intuitively, that the
optical power will tend to remain in the ridges. As the mode number
increases, either in the azimuthal or radial direction, the modes occupy
larger and larger volumes and eventually “spill out” of the ridges.
Because there is maximum confinement in the ridge when θ = π/2 and no
confinement at all when θ = 0, it is plausible that only a single mode
remains confined in the ridge for a proper choice of θ. The higher-order
modes radiate away from the ridge along the boundary. They can be
absorbed easily without degrading the fundamental mode. In this paper,
we justify the above intuitive arguments and show that strong
discrimination against unwanted modes can indeed be obtained.

The first step of the calculation is to obtain the propagation con- 
stants of whispering-gallery modes in circular cylinders in a con- 
venient form. This is done in Section II. In Section III, we investigate 
the case of helicoidal boundaries in the local mode approximation and 
obtain the design parameters. In Section IV, the case of small ridges 
is investigated with the help of a new perturbation method. 

The single-material helicoidal fiber discussed in this paper can be 
compared to the single-material ridge guide recently demonstrated’ 
and analyzed.‘> These two single-material fibers have features in 
common. The mode-selection mechanism rests on similar general 
principles. It can be ascribed to a coupling between ridges carrying 
trapped modes and two-dimensional substrates carrying radiation 
modes.® In the case of the ridge guide, the slab constitutes the two- 
dimensional substrate needed to ensure single-mode propagation. In 
the case of the helicoidal fiber, the dielectric rod itself can be con- 
sidered a two-dimensional mode sink because the whispering-gallery 
modes that it guides have a restricted thickness in the radial direction, 
as we have discussed before. In both cases, a good discrimination 
against high-order modes should in principle be obtained by increasing 
the distance between the absorbing elements and the ridges, because 
these elements are coupled through the radiation field rather than 
through evanescent waves. 

The theory given in this paper is applicable to purely metallic helicoidal
waveguides as well as to dielectric waveguides. The metallic
helicoidal waveguide is attractive as a low-loss, multichannel, single- 
mode system for long-distance microwave communication. It can be 
compared to the groove guide’ shown in Fig. 4c. The metallic helicoidal 
waveguide has the advantage that TEM modes are absent. In the 
groove guide, any lack of symmetry between the two plates introduces 
a large loss through coupling to the (slower) TEM modes. This is, in
fact, also the reason for the superiority of the dielectric ridge guide’
over the metallic groove guide.’


II. PROPAGATION OF WHISPERING-GALLERY MODES IN CYLINDRICAL
SURFACES 


Let us consider a circular dielectric cylinder with radius a. For E
modes, the φ and z components of the electric field have the form

$$E(r, \phi, z) = J_\nu(ur/a)\exp[i(\nu\phi + k_z z)], \tag{1}$$

where

$$u^2 = (k^2 - k_z^2)a^2. \tag{2}$$

Assuming that the discontinuity in refractive index is sufficiently
large, a condition well satisfied for quartz rods in air, the boundary
condition at r = a is

$$J_\nu(u) = 0, \tag{3}$$

because, for the type of mode considered, the field tends to vanish at
the boundary. The zeros of $J_\nu$ are denoted $u_m(\nu)$,
m = 0, 1, 2, ···.
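We introduce new coordinates ξ, ζ in place of φ, z (see Fig. 1)

$$\xi = r\cos\theta\,\phi - \sin\theta\,z, \qquad
  \zeta = \cos\theta\,z + r\sin\theta\,\phi, \tag{4}$$

where

$$\theta = \tan^{-1}(2\pi r/p), \tag{5}$$

the quantity p, for “period,” being for the moment an arbitrary
constant. The wave numbers $\Gamma_\xi$, $\Gamma_\zeta$ in the new
coordinate system are related to ν, $k_z$ by

$$\nu = r\cos\theta\,\Gamma_\xi + r\sin\theta\,\Gamma_\zeta, \tag{6a}$$
$$k_z = -\sin\theta\,\Gamma_\xi + \cos\theta\,\Gamma_\zeta. \tag{6b}$$

They are such that

$$\Gamma_\xi\,\xi + \Gamma_\zeta\,\zeta = \nu\phi + k_z z. \tag{6c}$$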
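
The phase identity (6c) can be checked numerically. The sketch below assumes the rotation reads ξ = r cos θ φ − sin θ z, ζ = cos θ z + r sin θ φ, with ν and k_z related to the new wave numbers as in (6a) and (6b). The sample values and function names are arbitrary and hypothetical.

```python
import math


def to_helical(r, theta, phi, z):
    """Coordinates (xi, zeta) across and along the reference helix."""
    xi = r * math.cos(theta) * phi - math.sin(theta) * z
    zeta = math.cos(theta) * z + r * math.sin(theta) * phi
    return xi, zeta


def to_cylindrical_wavenumbers(r, theta, gamma_xi, gamma_zeta):
    """Azimuthal order nu and axial wave number k_z, as in (6a), (6b)."""
    nu = r * math.cos(theta) * gamma_xi + r * math.sin(theta) * gamma_zeta
    k_z = -math.sin(theta) * gamma_xi + math.cos(theta) * gamma_zeta
    return nu, k_z


r, theta = 50.0, math.radians(2.5)    # sample radius (um) and helix angle
phi, z = 0.7, 123.0                   # sample point
gamma_xi, gamma_zeta = 0.05, 9.1      # sample wave numbers (1/um)

xi, zeta = to_helical(r, theta, phi, z)
nu, k_z = to_cylindrical_wavenumbers(r, theta, gamma_xi, gamma_zeta)

phase_helical = gamma_xi * xi + gamma_zeta * zeta
phase_cylindrical = nu * phi + k_z * z
```

The two phases agree to rounding error for any choice of the sample values, which is the content of (6c).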


The characteristic equations, (2) and (3), are now written, using
eqs. (6),

$$(-\sin\theta\,\Gamma_\xi + \cos\theta\,\Gamma_\zeta)^2 a^2
  + u_m^2(r\cos\theta\,\Gamma_\xi + r\sin\theta\,\Gamma_\zeta) = k^2a^2. \tag{7}$$

Equation (7) provides us with the desired relation between $\Gamma_\xi$
and $\Gamma_\zeta$. We wish to simplify this relation. Because we are
interested in whispering-gallery modes corresponding to large values of
ν, we can use the following approximation for $u_m(\nu)$®

$$u_m(\nu) = \nu + b_m\nu^{1/3}, \tag{8}$$

where $b_m$ is given in Table I. In the second column in Table I, the
J.W.K.B. approximation for $b_m$ obtained from simple ray optics
considerations is given. As we can see, the error does not exceed
1 percent even for small m.

Table I—Values of b parameter in eq. (10)

m    b_m        [3π(m + 3/4)/2^{3/2}]^{2/3} (J.W.K.B.)
0    1.85575    1.841
1    3.24461    3.239
2    4.38167    4.379
3    5.387      5.385
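
The second column of Table I can be reproduced with a few lines. The sketch assumes that the J.W.K.B. expression reads [3π(m + 3/4)/2^{3/2}]^{2/3}, a form chosen here because it matches the tabulated values; the names are hypothetical.

```python
import math

# First column of Table I (values quoted in the text).
TABLE_B = [1.85575, 3.24461, 4.38167, 5.387]
# Second column of Table I (J.W.K.B. values quoted in the text).
TABLE_JWKB = [1.841, 3.239, 4.379, 5.385]


def b_jwkb(m):
    """Assumed J.W.K.B. form of b_m."""
    return (3.0 * math.pi * (m + 0.75) / 2.0 ** 1.5) ** (2.0 / 3.0)


def bessel_zero_estimate(nu, m):
    """Approximate zero u_m(nu) = nu + b_m * nu^(1/3), as in (8)."""
    return nu + TABLE_B[m] * nu ** (1.0 / 3.0)


jwkb_values = [b_jwkb(m) for m in range(4)]
relative_errors = [
    abs(TABLE_B[m] - jwkb_values[m]) / TABLE_B[m] for m in range(4)
]
```

Each relative error comes out below 1 percent, in line with the statement in the text.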

We note further that a is not very different from r. Thus, we set
a = r + t, t ≪ r. Because we are considering waves that do not depart
very much from the reference helicoidal path, $\Gamma_\zeta$ is very
close to k, and the transverse wave number $\Gamma_\xi$ is small
compared with $\Gamma_\zeta$. Neglecting products of small quantities,
(7) becomes

$$\Gamma_\xi^2 + \Gamma_\zeta^2 = k^2[1 + 2t/\rho - 2b_m(k\rho)^{-2/3}], \tag{9}$$

where we have introduced the reference helix radius of curvature
ρ = a/sin² θ. The term 2t/ρ expresses the fact that, at the reference
radius r, the phase velocity is smaller than at the boundary with
radius a. The term $2b_m(k\rho)^{-2/3}$ results from the radial
variation of the field. The larger the radial mode number m, the smaller
the tangential wave number Γ. Note that the system is approximately
isotropic.
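
The step from (7) to (9) is only stated in the text. One way to fill it in, offered as a reconstruction under the approximations named above and not as the author's own derivation, is the following. It uses $u_m(\nu) = \nu + b_m\nu^{1/3}$, a = r + t with t ≪ r, and the zeroth-order values $\Gamma_\zeta \approx k$, $\Gamma_\xi \approx 0$ inside the small correction terms.

```latex
\begin{align*}
u_m^2(\nu) &\approx \nu^2 + 2b_m\nu^{4/3},
  \qquad a^2 \approx r^2\,(1 + 2t/r), \\
k_z^2 r^2\,(1 + 2t/r) + \nu^2 + 2b_m\nu^{4/3}
  &\approx k^2 r^2\,(1 + 2t/r), \\
\nu^2 + k_z^2 r^2 &= r^2\,(\Gamma_\xi^2 + \Gamma_\zeta^2)
  \qquad \text{[from (6a) and (6b)]}, \\
\Gamma_\xi^2 + \Gamma_\zeta^2
  &\approx k^2 + (k^2 - k_z^2)\,\frac{2t}{r}
     - \frac{2b_m\nu^{4/3}}{r^2}. \\
\intertext{With $k_z \approx k\cos\theta$ and $\nu \approx kr\sin\theta$
in the correction terms, and $\rho \approx r/\sin^2\theta$,}
(k^2 - k_z^2)\,\frac{2t}{r} &\approx k^2\,\frac{2t\sin^2\theta}{r}
  = k^2\,\frac{2t}{\rho}, \\
\frac{2b_m\nu^{4/3}}{r^2}
  &\approx 2b_m k^2 (kr)^{-2/3}\sin^{4/3}\theta
  = 2b_m k^2 (k\rho)^{-2/3}, \\
\Gamma_\xi^2 + \Gamma_\zeta^2
  &\approx k^2\left[1 + 2t/\rho - 2b_m(k\rho)^{-2/3}\right].
\end{align*}
```

The right-hand side does not depend on the direction of the wave vector in the ξ, ζ plane, which is the sense in which the system is approximately isotropic.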


III. HELICOIDAL BOUNDARY


In the previous section we have assumed that the boundary is a
circular cylinder with radius a. We now assume that a is a function of
ξ, but that it remains independent of ζ. By letting a vary with ξ, we
generate a helicoidal surface. Azimuthal confinement of the
whispering-gallery beams can be expected for various well-shaped
profiles a(ξ). For simplicity, we assume here that a(ξ) = a, a constant,
for −d < ξ < d, and a = r, where r denotes the reference radius,
anywhere else in the period. A slightly tapered transition region is
assumed. Mode mixing can therefore be neglected in the evaluation of the
propagation constants of the modes m = 0, 1, 2, ···. A small amount of
mode mixing is nevertheless needed for this mode-selection mechanism to
operate.”

1648 THE BELL SYSTEM TECHNICAL JOURNAL, OCTOBER 1974

From (9), the wave numbers for the fundamental mode (m = 0)
and first-order mode (m = 1) in the ridge (unprimed number) and
outside the ridge (primed number) are, respectively,


$$\Gamma_0^2 = k^2[1 + 2t/\rho - 2b_0(k\rho)^{-2/3}], \tag{10a}$$
$$\Gamma_0'^2 = k^2[1 - 2b_0(k\rho)^{-2/3}], \tag{10b}$$
$$\Gamma_1^2 = k^2[1 + 2t/\rho - 2b_1(k\rho)^{-2/3}], \tag{10c}$$
$$\Gamma_1'^2 = k^2[1 - 2b_1(k\rho)^{-2/3}]. \tag{10d}$$


The axial wave numbers $\Gamma_{\zeta 0}$ and $\Gamma_{\zeta 1}$ for the
two modes 0 and 1 are now obtained using the standard dielectric slab
theory. If we normalize the axial wave number $\Gamma_{\zeta 0}$ by
defining


$$K^2 = (\Gamma_{\zeta 0}^2 - \Gamma_0'^2)/(\Gamma_0^2 - \Gamma_0'^2) \tag{11}$$

and introduce the V parameter

$$V^2 = (\Gamma_0^2 - \Gamma_0'^2)d^2, \tag{12}$$

we have, for the modes m = 0, an explicit relation between V and K,

$$V = \{\tan^{-1}[K(1 - K^2)^{-1/2}] + n\pi/2\}(1 - K^2)^{-1/2}, \tag{13}$$

where n = 0, 2, 4, ··· correspond to even modes and n = 1, 3, ··· to
odd modes.* A similar relation holds for the modes m = 1,
n = 0, 1, 2, ···.
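Relation (13) gives V explicitly in terms of K, but in practice K is usually wanted for a given V. The sketch below is an illustration under stated assumptions, not the author's procedure: it evaluates eq. (13) and inverts it by bisection, relying on V increasing monotonically with K on 0 ≤ K < 1. The function names are hypothetical.

```python
import math


def v_from_k(K, n=0):
    """Eq. (13): V as an explicit function of K (0 <= K < 1) for mode n."""
    root = math.sqrt(1.0 - K * K)
    return (math.atan(K / root) + n * math.pi / 2.0) / root


def k_from_v(V, n=0):
    """Invert eq. (13) by bisection.

    Returns None when V lies below the cutoff value n*pi/2 of mode n,
    i.e. when the mode is not guided.
    """
    if V < n * math.pi / 2.0:
        return None
    lo, hi = 0.0, 1.0 - 1e-12
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if v_from_k(mid, n) < V:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Setting K = 0 in eq. (13) gives the cutoff V = nπ/2, so in this sketch the first odd mode (n = 1) appears only once V exceeds π/2.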

The axial wave numbers $\Gamma_{\zeta mn}$ of these various modes m, n
are plotted in Fig. 3 as functions of the ridge width 2d, for
$\lambda_0 = 1$ μm, n = 1.45 (quartz), a rod radius a = 50 μm, and a
ridge height t = 3.5 μm. We have chosen $\theta = 2.5°$, corresponding
to a helix radius of curvature $\rho = 25$ mm. Modes whose axial wave
number is less than $\Gamma_0'$ (the wave number of the fundamental mode
outside the ridge) suffer radiation losses.

This figure clearly shows that only one mode (m = 0, n = 0) is free
of radiation loss if 2d is less than 14 μm. For 2d = 14 μm, the field of
the fundamental mode decays in azimuth by a factor of 1/e at a
distance

$$\xi_0 = (\Gamma_{\zeta 0}^2 - \Gamma_0'^2)^{-1/2} = 6.3\ \mu\mathrm{m} \tag{14}$$


on either side of the ridge.

* The mode number n should not be confused with the refractive index n.

Fig. 3—Axial wave numbers $\Gamma_{\zeta mn}$ for various m and n modes (r and $\xi$ coordinates).
Radiation loss is suffered whenever $\Gamma_\zeta < \Gamma_0'$. The radiation zone is shaded. This
figure shows that single-mode propagation is possible if the ridge width 2d is less
than 14 μm (for $\lambda_0$ = 1 μm, n = 1.45, t = 3.5 μm, a = 50 μm, and $\theta$ = 2.5°).

For two channels, the “wings” holding the rod in Fig. 1 are located at
a distance $\pi a/2 \approx 80$ μm from the corrugation. At that
distance, the field has decayed by a factor of more than $10^5$. The
fundamental mode therefore suffers negligible radiation loss
(bending losses are not considered here). If we set the condition that
$\xi_0$ be 1/10 of $\pi a/4$, we obtain the approximate condition
$\theta \approx d/a$ for single-mode low-loss operation. More detailed
relations are given at the end of Section IV.

The local mode theory used in this section is expected to be
applicable when the ridge width 2d is large compared to the ridge
height, t, and large compared to the wavelength, λ. In the next section,
we find that a perturbation method applicable to small ridge areas (2td)
leads to almost identical conclusions.


IV. LINE PERTURBATION OF SURFACE WAVES 


We give a general theory of the trapping of surface waves by rods 
of small cross section. This theory is then applied to helicoidal ridges 
of small cross section. 

Let us consider an isotropic surface, perhaps an inductive surface,
supporting a plane wave with wave number $k$. Let us introduce a
dielectric rod of very small cross section parallel to this surface,
some distance away from it (Fig. 4a). Because the power carried by the
plane wave is infinite, a straightforward application of the
conventional perturbation method does not give any meaningful result.
Therefore, we shall proceed the other way around. We start from the
perturbed state and assume that we know the propagation constant
$h > k$. The wave is in that case confined transversely with a decay
rate $(h^2 - k^2)^{1/2}$. We now “peel off” the rod and evaluate the
successive perturbations until the perturbation resulting from the rod
vanishes. By specifying that $h \to k$ in that limit, the initial value
of $h$ is obtained.

Fig. 4—Various methods of introducing a line perturbation on a surface wave.
(a) Reactive (e.g., corrugated) surface perturbed by a dielectric rod. (b) TEM waves
perturbed by a dielectric slab. (c) “Groove guide.”? (d) Ridge guide considered in
this paper. The torsion of the helicoidal motion is not essential. The radius of
curvature $\rho$ of the helix is the important parameter that determines mode selection.

For simplicity, this method is first explained for the case where
$\epsilon \approx \epsilon_0$, the perturbed field being of the order of
the unperturbed field. Let $\sigma$ be a parameter such that
$\sigma = 0$ corresponds to the absence of the rod and $\sigma = 1$
corresponds to the presence of the rod. Furthermore, $\sigma$ is so
chosen that, in the perturbation formula


$$dh/d\sigma = \alpha(h^2 - k^2)^{1/2}, \tag{15}$$


$\alpha$ is a constant. This can be done because we have factored out a
term $(h^2 - k^2)^{1/2}$ inversely proportional to the power carried by
the mode. (The distortion of the field in the close neighborhood of the
rod does not contribute significantly to the total power, because of the
large transverse extent of the field.) The ratio $\sigma$ is essentially
the ratio of the present cross-section area of the perturbing rod to its
original cross-section area. Integrating eq. (15) from $\sigma = 1$ to
$\sigma = 0$, we obtain


$$\int_k^h (h^2 - k^2)^{-1/2}\,dh = \alpha \tag{16a}$$


or

$$h = k\cosh\alpha \approx k(1 + \alpha^2/2). \tag{16b}$$


To clarify the significance of this result, let it be applied to a
configuration where the exact solution is known. Consider two parallel
perfectly conductive plates with spacing D carrying TEM modes, as
shown in Fig. 4b. If we introduce a dielectric slab with
$\epsilon \approx \epsilon_0$ and width 2d, we obtain the so-called
“H-guide” configuration proposed by Tischer.’ (Note, however, that we
consider here the H modes rather than the low-loss modes.) The parameter
$\sigma = y/d$, where y is shown in Fig. 4b, clearly satisfies the
requirements set up above. The conventional perturbation formula (see,
for example, Ref. 5, Part II, eq.
(21), with E, YE, H, 2H, E' = E, Ht = —H) is 


$$\Delta h = \tfrac{1}{2}\omega \int (\epsilon - \epsilon_0)E^2\,dS \Big/ \int \mathbf{E}\times\mathbf{H}\cdot d\mathbf{S}. \tag{17}$$

For our case, we obtain, taking into account the
$\exp[-(h^2 - k^2)^{1/2}|y|]$ dependence of the field on y,

$$dh/d\sigma = (\mu_0/\epsilon_0)^{1/2}(h^2 - k^2)^{1/2}(\epsilon - \epsilon_0)\omega d \equiv \alpha(h^2 - k^2)^{1/2}. \tag{18}$$

Thus the constant $\alpha$ is, from (18),

$$\alpha = (n^2 - 1)kd, \tag{19}$$


where $\epsilon/\epsilon_0 = n^2$. Application of (16b) now gives the
perturbed wave number h

$$h = k[1 + \tfrac{1}{2}(n^2 - 1)^2k^2d^2]. \tag{20}$$


The exact solution to the problem, for small $(n^2 - 1)kd$, is well
known [see, for example, Ref. 5, Part II, footnote after eq. (11)]. We
have, with the approximation
$\tan[(n^2 - 1)^{1/2}kd] \approx (n^2 - 1)^{1/2}kd$,


$$h^2 - k^2 = (n^2 - 1)^2k^4d^2. \tag{21}$$


Equation (21) coincides with our perturbation result, (20), because
$h \approx k$. Having satisfied ourselves with the validity of our
perturbation technique, we apply it to a small wall perturbation. We
assume that the case of quartz in air is the same as the case of a
metallic boundary, except for the wavelength $\lambda/n$ replacing
$\lambda$.
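The agreement between (20) and the exact solution can be illustrated numerically. The sketch below is not from the paper. It assumes that the exact fundamental mode of the slab of width 2d obeys the usual even-mode relation κ tan(κd) = γ, with κ² = n²k² − h² and γ² = h² − k², which reduces to eq. (21) for small arguments. The function names and the sample values are hypothetical.

```python
import math


def h_perturbation(n, k, d):
    """Eq. (20): h = k[1 + (1/2)(n^2 - 1)^2 k^2 d^2]."""
    alpha = (n * n - 1.0) * k * d  # eq. (19)
    return k * (1.0 + 0.5 * alpha * alpha)


def h_exact(n, k, d):
    """Fundamental even mode of a slab of width 2d by bisection.

    Assumed dispersion relation: kappa * tan(kappa * d) = gamma.
    Valid while sqrt(n^2 - 1) * k * d stays below pi/2.
    """
    def f(h):
        kappa = math.sqrt(n * n * k * k - h * h)
        gamma = math.sqrt(h * h - k * k)
        return kappa * math.tan(kappa * d) - gamma

    lo, hi = k, n * k  # f > 0 at h = k and f < 0 at h = n*k
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    n, k, d = 1.45, 1.0, 0.05  # illustrative values, arbitrary units
    print("perturbation:", h_perturbation(n, k, d))
    print("exact       :", h_exact(n, k, d))
```

For a thin slab, as in the sample values, the two estimates of h − k differ by well under a few percent, and the difference grows as (n² − 1)kd increases.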

For a wall perturbation with cross-section area s, the perturbation
formula is


$$\Delta h = \tfrac{1}{2}\omega\mu_0 H^2 s \Big/ \int \mathbf{E}\times\mathbf{H}\cdot d\mathbf{S}, \tag{22}$$


if the electric field is equal to zero. Note that h increases if the
volume is increased, e.g., if we introduce corrugations in the wall.

For H waves uniform along the y-axis (see Figs. 4c and 4d)
($E_y \equiv E$), we have, from Maxwell’s equations,


$$H_x = -(k/\omega\mu_0)E, \tag{23a}$$
$$H_z = (i/\omega\mu_0)\,\partial E/\partial x. \tag{23b}$$


Substituting in (22) we obtain the perturbation

$$\Delta h = (s/2k)(\partial E/\partial x)^2(h^2 - k^2)^{1/2} \Big/ \int E^2\,dx. \tag{24}$$


Defining $\sigma$ as x/t (see Figs. 4c or 4d), the constant $\alpha$
defined in (15) is found to be


$$\alpha = (s/2k)(\partial E/\partial x)^2 \Big/ \int E^2\,dx, \tag{25}$$

the derivative being evaluated at the first zero of E(x) for the mode
m = 0, the second for the mode m = 1, and so on.
For whispering-gallery modes, we have


$$E = \mathrm{Ai}(t), \tag{26a}$$

where Ai( ) denotes the Airy function and

$$t = (2k^2/\rho)^{1/3}x - (2k^2/\rho)^{-2/3}(k^2 - h^2), \tag{26b}$$

$\rho$ being the boundary radius. Substituting in eq. (25), we obtain

$$\alpha = 2f_m ktd/\rho, \tag{27}$$

where $f_m$ is a numerical factor

$$f_m = (d\mathrm{Ai}/dt)^2_{t=t_m} \Big/ \int_{t_m}^{\infty} \mathrm{Ai}^2(t)\,dt, \tag{28}$$

$t_0, t_1, \cdots, t_m, \cdots$ being the zeros of Ai(t). By numerical
integration, we find

$$f_0 = 0.981\cdots \tag{29a}$$
$$f_1 = 0.955\cdots. \tag{29b}$$


Thus, the change in propagation constant resulting from a wall
deformation of area s = 2td is, from (27),

$$h_m - k = \tfrac{1}{2}f_m^2k^3s^2/\rho^2. \tag{30}$$


As long as the perturbation is small, $h_1$ remains smaller than
$\Gamma_0'$ and only the mode m = 0 is free of radiation loss. For a
sufficiently large perturbation, however, $h_1$ may exceed $\Gamma_0'$.
Then the modes m = 0 and m = 1 are both free of radiation loss, that is,
the system is no longer single-mode. The condition for the system to be
single-mode is therefore


$$h_1 - \Gamma_1' < \Gamma_0' - \Gamma_1',$$


or

$$f_1^2k^3s^2/\rho^2 < 2k(b_1 - b_0)(k\rho)^{-2/3}. \tag{31}$$


$\Gamma_0'$, $\Gamma_1'$, and the constants $b_0$, $b_1$ were defined in
Section III. The above condition can be written, using the values in
Table I for $b_0$, $b_1$,


$$k^2s < 1.74\,(k\rho)^{2/3}. \tag{32}$$


If we take the limit d → 0 in the expressions given in Section III, we
obtain instead

$$k^2s < 1.68\,(k\rho)^{2/3}, \tag{33a}$$

which is very close to the perturbation result, (32). Thus, the local
mode approach and the perturbation approach agree closely, not only in
form, but also numerically. Because $\rho \approx a/\theta^2$, $\theta$
being the angle that the helix makes with the z axis, and
$\theta \approx 2\pi a/p$, p being the helix period, condition (33a) for
single-mode operation can be rewritten


$$s < 0.012\,(\lambda^2p^2/a)^{2/3}. \tag{33b}$$


The extent $\xi_0$ in azimuth of the fundamental mode, defined as the
1/e point of the field,
$\xi_0 = (h_0^2 - k^2)^{-1/2} \approx (2k)^{-1/2}(h_0 - k)^{-1/2}$, is,
from (30) with m = 0, given by

$$\xi_0 \approx \rho/k^2s. \tag{34a}$$


If we specify that the fundamental mode field has decayed by a factor
of $10^5$ at the “wings,” located, for two channels (see Fig. 1), a
distance $\pi a/2$ away from the ridge, we must have
$\xi_0 = (\pi a/2)/11.5$. Introducing the helix period p, this condition
for the fundamental mode to have small radiation loss can be written


$$s > 0.005\,(\lambda p/a)^2. \tag{34b}$$


The condition for a single mode to propagate, (33b), and for the
fundamental mode to have small radiation losses, (34b), are consistent if


$$p < 5a^2/\lambda. \tag{35}$$


For example, if a = 50 μm, λ = (1/1.45) μm, according to eq. (35),
the helix period, p, must be smaller than 18 mm. A period of 10 mm,
for instance, would be quite adequate. Note that for such rather long
periods the optical path is not significantly increased by the circular
motion. Radiation into free space is negligible, as long as the medium
surrounding the ridge is air. For mechanical reasons, however, we may
want to use a material with lower index. In that case, radiation into
the surrounding medium may be a limitation.
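The design window implied by conditions (33b), (34b), and (35) can be tabulated for the numerical example above. This is a minimal sketch under stated assumptions: the coefficients 0.012, 0.005, and 5 are taken as printed, all lengths are in micrometers, and the function names are hypothetical.

```python
def max_helix_period(a, wavelength):
    """Eq. (35): upper bound on the helix period, p < 5 a^2 / lambda."""
    return 5.0 * a * a / wavelength


def ridge_area_bounds(a, wavelength, p):
    """Return (s_min, s_max) for the ridge area s.

    s_min comes from eq. (34b) (small radiation loss) and
    s_max from eq. (33b) (single-mode operation).
    """
    s_min = 0.005 * (wavelength * p / a) ** 2
    s_max = 0.012 * (wavelength ** 2 * p ** 2 / a) ** (2.0 / 3.0)
    return s_min, s_max


if __name__ == "__main__":
    a = 50.0              # rod radius, micrometers
    lam = 1.0 / 1.45      # wavelength in quartz, micrometers
    p = 10000.0           # helix period of 10 mm, in micrometers
    print("max period (um):", max_helix_period(a, lam))
    print("ridge-area window (um^2):", ridge_area_bounds(a, lam, p))
```

With these inputs the period bound is about 18 mm, and a 10-mm period leaves a narrow but nonempty range of admissible ridge areas.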


V. CONCLUSION 


The single-material helicoidal fiber proposed earlier by the author
has been shown to support only one mode, with low radiation loss,
provided the following two conditions are satisfied:


Helix period ≈ rod cross-section area/wavelength.
Ridge area ≈ rod cross-section area/70.


More detailed calculations, similar in spirit to the ones given in
Ref. 5, would be necessary to specify the magnitude of the radiation
losses and the exact value of the mode discrimination. The helicoidal
fiber, like any single-mode fiber with large mode cross section, may be
sensitive to bending losses. The bending loss is therefore another key
point that needs to be investigated.


A Proposed Multiple-Beam Microwave
Antenna for Earth Stations 
and Satellites 


By E. A. OHM 
(Manuscript received October 3, 1973) 


An offset Cassegrainian antenna with essentially zero aperture blockage 
is expected to support closely spaced well-isolated beams suitable for earth
stations and satellites. Each beam is fed with a separate small-flare-angle 
corrugated horn and has good area efficiency over a 1.75:1 bandwidth. 
Each beam also has good cross-polarization properties. The antenna is 
compact, and the design appears practical for a 4- and 6-GHz earth station, 
a 20- and 30-GHz earth station, and a 20- and 30-GHz satellite. 


I. INTRODUCTION 


Satellite communication systems with large capacities can be 
achieved if the satellites and earth stations are provided with multiple- 
narrow-beam antennas. The capacity is proportional to the number of 
satellites, and thus it is important to use as many as practical in the 
limited orbital space. A moderate number of the resulting closely 
spaced satellites can be served by a single antenna at each earth- 
station site if the antenna is patterned after the offset Cassegrainian 
antenna shown in Fig. 1. This design allows an orderly expansion in 
communication capacity by the addition of feed horns. Since only one 
antenna is needed at each site, the design also permits a large saving in 
earth-station costs. Good multiple-beam performance can be achieved 
across all up/down pairs of satellite frequency bands, including those 
well below 10 GHz. At 20 and 30 GHz, a large earth-station antenna 
with acceptable thermal and wind distortion is hard to achieve. How- 
ever, with the design outlined here, these problems can be largely over- 
come because the main reflector and subreflector can be fixed in posi- 
tion, thus allowing a stiffer structure. The steering of each beam is 
achieved by moving one of the feed horns, resulting in a steerable angle
sufficient for tracking near-synchronous satellites.

Fig. 1—Geometry of antenna and feed system. The feed horns are scaled for a
3-m 20- and 30-GHz satellite antenna. For a 30-m 4- and 6-GHz earth-station antenna,
L and 2a are half as large as shown.


The offset Cassegrain is also appropriate for use aboard a satellite 
because all beams, including those moderately far off-axis, have high 
area efficiencies and low side-lobe levels. However, good results on a 
satellite are restricted to bands well above 10 GHz because the antenna 
size is limited by the launch vehicle. 

It has been previously shown that a multiple-beam antenna can be 
achieved in a variety of ways,²⁻⁵ where each approach has emphasized
one feature desired in a practical antenna. By combining several of these 
with a corrugated feed horn’ and an enlarged subreflector, it is possible 
to achieve a compact antenna with exceptionally good multiple-beam 
characteristics. In particular, in the offset Cassegrainian antenna 
shown in Fig. 1: 


(i) An offset design essentially eliminates beam blockage, thus
allowing a significant reduction in side-lobe level.’ This, in
turn, results in higher isolation between beams and a lower
antenna noise temperature.

(ii) The Cassegrainian feed system is compact and has a large
focal-length-to-diameter (F/D) ratio. The large F/D ratio
reduces aberrations to an acceptable level, even when a beam
is moderately far off-axis.

(iii) A corrugated feed horn is essentially a Gaussian-beam launcher®
and, as such, it can be used to achieve beams with low side-lobe
levels. The corresponding feed-horn aperture® is small enough
to allow the beams to be closely spaced.

(iv) An enlarged subreflector, as indicated by the dashed line in
Fig. 1, allows the main reflector to be properly illuminated,
even when a beam is moderately far off-axis.


These features can be achieved over a wide range of antenna
parameters. Using the results developed here, the sample calculations
summarized in Table I show that (i) the off-axis beam angles are
practical, (ii) the coma aberration is small, (iii) the feed-horn
dimensions are reasonable, and (iv) the isolations between beams are
large.


II. OFF-AXIS DESIGN CRITERIA


Consider a parabolic reflector that is circularly symmetric and 
illuminated with a feed at its prime focus. If the aperture is large in 
wavelengths and the prime focal-length-to-diameter, F’/D, ratio is 2 
or more, it is well known that a beam can be scanned over tens of beam-
widths by lateral displacement of the feed.² A Cassegrainian antenna
normally has a secondary focal length F larger than F’, and thus a 
larger F/D ratio. Consequently, a scanned beam can also be obtained
by displacing a feed at the secondary focus.°® For the small off-axis angle 
reported in Ref. 5 (4 beamwidths = 0.9°), the on-axis and off-axis 
beam characteristics are nearly identical, and the residual differences 
can be readily explained in terms of an equivalent parabola.*:* The 
equivalent parabola, in turn, has characteristics identical to those of a 
prime-focus parabola. Consequently, the prime-focus theory² can be
used to predict the off-axis equivalent-parabola results, and thus the 
Cassegrainian results. This chain of reasoning assumes that the equiv- 
alent-parabola concept is valid for the antenna parameters (F’/D and 
F/D ratios and off-axis angles) considered here. In support of this 
assumption, it is of interest to note that the chief off-axis beam
parameter of a prime-focus parabola, namely,²


$$X' = \frac{N\,(D/F')^2}{1 + 0.02\,(D/F')^2}, \tag{1}$$

where N is the off-axis angle in half-power beamwidths, has a value in
Ref. 5 of about 30. Thus, the equivalent-parabola concept is valid for
X′ values at least through 30. Furthermore, the known results indicate
that the region of validity can be extrapolated to X′ values well
beyond 30. In particular, Ref. 5 shows that the coma lobe, which is the
first side lobe aimed toward the on-axis direction, increases very slowly
as a function of off-axis beam angle. From Ref. 2, it is also known that
an increase in coma-lobe level is a sensitive leading indicator of serious
aberration problems, and that X′ increases rapidly with coma-lobe
level. It follows that X′ in Ref. 5 can be much larger than 30 before a
larger increase in coma-lobe level signals the onset of serious
aberrations. The upper limit of X′ should and can be calculated but, in
the meantime, some of the results in Table I include an engineering
judgment that the equivalent-parabola concept is valid for X′ values
through 45. Even if the upper limit turns out to be somewhat less, the
offset Cassegrain can still support a respectable number of multiple
beams, i.e., for X′ = 30, the number of 1°-spaced beams from the
earth-station antenna of Table I is 7 rather than 11.
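The beam counts quoted above (7 rather than 11) can be reproduced with a short calculation. The sketch below is an illustration only and rests on two assumptions not stated in this form in the text: that N ≈ X′ for the earth-station antenna (Table I lists the same values for both), and that the limit is set at 6 GHz, where the beamwidth is 0.11°. The function names are hypothetical.

```python
import math


def max_off_axis_angle(n_beamwidths, beamwidth_deg):
    """Largest off-axis angle, in degrees, for a given number of beamwidths N."""
    return n_beamwidths * beamwidth_deg


def beams_in_row(max_off_axis_deg, spacing_deg=1.0):
    """Count beams spaced spacing_deg apart within +/- max_off_axis_deg.

    One on-axis beam plus an equal number on each side; the small
    epsilon guards against floating-point round-off at exact multiples.
    """
    per_side = math.floor(max_off_axis_deg / spacing_deg + 1e-9)
    return 2 * per_side + 1


if __name__ == "__main__":
    print(beams_in_row(5.0))                             # Table I off-axis angle
    print(beams_in_row(max_off_axis_angle(30, 0.11)))    # X' limited to 30
```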

An important parameter of an off-axis beam is the third-order phase
error across the beam at the antenna aperture. This error, $\Delta\phi$,
increases the level of the coma lobe.² For a symmetrical parabola
illuminated with a feed displaced laterally from the prime focus, the
peak value of $\Delta\phi$ at the edge of the aperture can be calculated
from eq. (12) of Ref. 2. Similarly, when an offset parabola (as in
Fig. 1) is illuminated
with a feed displaced laterally from the prime focus (in the 2 direction 
in Fig. 1), the maximum third-order phase error, $\Delta\phi'$, which
occurs at the side edge of the aperture, can be calculated from”


$$\Delta\phi' = \frac{2\pi}{32\lambda}\,\frac{F'\sin\theta}{(F'/D)^3}\,\frac{1}{1 + (Y_2/2F')^2}, \tag{2}$$


where F′ is the prime focal length, $\theta$ is the off-axis angle of
the beam, D is the diameter of the offset aperture, and $Y_2$ (see
Fig. 1) is the offset height of the aperture. Equation (2) assumes that
the feed is also displaced slightly in the longitudinal direction (the
−z direction in Fig. 1) to cancel field curvature.

Comparison of eqs. (12) and (13) of Ref. 2 shows that $\Delta\phi'$ is
proportional to X′. Noting that $\Delta\phi'$ in (2) is defined in terms
of the aperture diameter, D, independently of whether the aperture is
centered or offset, it follows that D in eq. (1) should be interpreted
in the same way, i.e., it is the diameter of the offset aperture, D, and
not the diameter of the aperture of the full parabola (8/3 D in Fig. 1).




If the prime-focus feed illuminating the offset parabola is replaced
with a Cassegrainian feed system, as in Fig. 1, and the equivalent-
parabola concept is valid, F′ in (1) and (2) can be replaced with the
Cassegrainian focal length F. In Fig. 1, F is the distance Z times the
ratio of the centerline-ray heights at the main reflector and at the
subreflector, i.e., $F = Z(Y_2/Y_1)$. For the antenna parameters
listed in Table I, the values of $\Delta\phi$ calculated from (2) are
substantially less than 90°. For these values, the first side lobe, or
coma lobe, is increased in amplitude, but the side lobes which are
positioned further out, i.e., those that determine the minimum spacing
of well-isolated beams, are virtually unchanged. Accordingly, in the
remainder of this paper, it is assumed that $\Delta\phi$ is zero. The
corresponding values of X, which are calculated from (1) after replacing
F′ by F, are found to be 4.5 or less. From the plots given in Ref. 2,
the off-axis and on-axis beam characteristics are essentially identical
for these values of X.


III. BEAM SPACING


Suppose the amplitude distribution across an unblocked aperture 
is that of a dominant-mode Gaussian beam, that the amplitude at the 
edge is truncated at the —15-dB point, and that the phase front is 
uniform. The envelope of the resulting radiation pattern is shown in 
Fig. 2. For the offset Cassegrain shown in Fig. 1, the above amplitude 
and phase distribution can be achieved by placing a corrugated feed 
horn® at the secondary focal point, f. Comparison of Dragone’s results? 
with the standard Gaussian-beam equations" shows that the radius of 
the beam, w, at the —8.686-dB (or 1/e amplitude) point, is related to 
the feed-aperture radius, a, by 


w = 0.647 a. (3) 


The comparison also shows that the phase-front radius is equal to the 
slant length of the feed-horn, L. Using Gaussian-beam equations," the 
beam parameters in any other region in the feed system can be cal- 
culated. One result is that the required feed-horn length, L, can be 
found from the half-angle, y, subtended at the focus f by the subre- 
flector, and the illumination taper, 7’, in dB, at the edge of the sub- 
reflector. 


L = 0.076 5 Tan. (4) 


Equation (4) includes the feed-horn design criterion® 


CAL = 1, (5) 
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Fig. 2—Estimate of the side-lobe envelope resulting from a Gaussian illumination 
taper truncated at the —15 dB point, courtesy of T. S. Chu. 


where, for a 1.75:1 bandwidth, is specified at the low end of the fre- 
quency range. Equation (4) is strictly valid only when y > d/Dseup, 
where D,yp is the diameter of the subreflector. For an equivalent 
parabola with focal length F’,’ it can be shown that the y criterion is 
automatically satisfied when the F'/D ratio is less than 5. 

The corresponding feed-horn aperture radius, a, is found by solving
$a^2/\lambda L = 1$ for a, and substituting L from (4):


$$a = 0.275\,\frac{\lambda}{\psi}\sqrt{T_{\mathrm{dB}}}. \tag{6}$$


Suppose the antenna shown in Fig. 1 has a diameter-to-wavelength
ratio, $D/\lambda$, in the hundreds, an equivalent focal length, F, and
an F/D ratio larger than 2. Then if a second feed-horn is placed
adjacent to the on-axis feed, the second beam will be aimed in an
off-axis direction, $\theta_1 = 2a/F$. Inserting a from eq. (6) and
noting’ that $\psi = D/2F$,


$$\theta_1 = 1.1\,\frac{\lambda}{D}\sqrt{T_{\mathrm{dB}}}. \tag{7}$$

Inserting (7) into the parameter on the abscissa of Fig. 2, the value of
u for contiguous corrugated feed horns is


$$u_1 = 3.46\sqrt{T_{\mathrm{dB}}}. \tag{8}$$


For T = 15 dB, $u_1$ = 13.4. From Fig. 2, the −3-dB beamwidth is
3.62; thus, $u_1$ corresponds to 13.4/3.62 = 3.7 beamwidths. For
$u_1$ = 13.4, Fig. 2 shows that the side-lobe envelope level is −37 dB;
this is approximately equal to the isolation of two beams spaced
$\theta_1$ degrees apart. The isolations for typical beam spacings are
included in Table I. In the earth-station example, the minimum beam
spacing is 0.6°, but the corresponding isolations, 37 and 43 dB at 4 and
6 GHz, respectively, are too small for allowable adjacent-satellite
interference.¹² These isolations can be increased to 45 and 49 dB,
respectively, by increasing the beam (and satellite) spacing to 1°. The
increased beam spacing also allows room between feed horns, so they can
be moved individually to track small errors in satellite positioning.
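The feed-horn relations above can be collected into one short calculation. The sketch below is illustrative and assumes the reading of eqs. (4), (6), (7), and (8) in which L = 0.076 λT/ψ², a = 0.275 (λ/ψ)√T, θ₁ = 2a/F, and u₁ = 3.46√T, with T in dB and ψ = D/2F in radians. The function name and the dictionary keys are hypothetical.

```python
import math


def feed_design(D, F, wavelength, taper_db):
    """Feed-horn length, aperture radius and beam spacing.

    D, F and wavelength share one length unit; taper_db is the edge
    illumination taper T in dB. Angles are returned in radians.
    """
    psi = D / (2.0 * F)                                    # half-angle at the focus
    L = 0.076 * wavelength * taper_db / psi ** 2           # eq. (4)
    a = 0.275 * (wavelength / psi) * math.sqrt(taper_db)   # eq. (6)
    theta1 = 2.0 * a / F                                   # contiguous-horn spacing
    u1 = 3.46 * math.sqrt(taper_db)                        # eq. (8)
    return {"psi": psi, "L": L, "a": a, "theta1": theta1, "u1": u1}


if __name__ == "__main__":
    # Earth-station example: D = 30 m, F = 100 m, 4 GHz (7.5 cm), T = 15 dB.
    result = feed_design(30.0, 100.0, 0.075, 15.0)
    print("L (m):", result["L"])
    print("2a (m):", 2.0 * result["a"])
    print("theta1 (deg):", math.degrees(result["theta1"]))
    print("u1:", result["u1"])
```

Evaluating the wavelength at the low end of the band follows the design criterion stated with eq. (5); the other inputs are the earth-station entries of Table I.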


IV. AREA EFFICIENCY 


Suppose an off-axis plane wave is incident on the main-reflector aper- 
ture shown in Fig. 1. The rays intercepted and reflected by the main 
reflector are displaced laterally with respect to those from an on-axis 
beam. But if the subreflector surface is sufficiently broadened, each of 
these rays will be intercepted and focused to a new point that is dis- 
placed laterally with respect to focal point f. To accommodate off-axis 
beams in the horizontal plane, the subreflector width is increased ; 
similarly, for beams in the vertical plane, the height is increased, as 
indicated by the dashed line in Fig. 1. 

The lateral displacement of the focus, corresponding to an off-axis 
beam at an angle @, is equal to @ times the equivalent focal length, F. 
It is assumed that a separate corrugated feed horn is optimally posi- 
tioned about the focus of each off-axis beam, i.e., each feed is pointed 
such that the original on-axis amplitude distribution is maintained 
across the main-reflector aperture, and each feed is longitudinally 
positioned to minimize aberrations. 

The phase center of a corrugated horn can be calculated as a func- 
tion of frequency.® This in turn allows the longitudinal position of the 
feed to be optimized for broadband performance.

Assuming the foregoing precautions are observed, each beam of the
antenna in Fig. 1 has a computed gain about 1 dB less than that
obtainable from an aperture with a uniform amplitude distribution. The
underlying reasons for the good area efficiency, 80 percent, are (i) the
main reflector does not have to be enlarged to accommodate off-axis
beams, and (ii) the F/D ratio of a Cassegrainian antenna is fairly large.


V. POLARIZATION CROSS-COUPLING 


T. S. Chu and R. H. Turrin have shown that the cross-coupling of an
offset reflector is a function of (i) the angle between the feed axis and
the reflector axis and (ii) the half-angle subtended at the focus by the
reflector.!® In an offset Cassegrainian antenna with a moderate F/D
ratio, these angles are fairly small; thus, the cross-coupling is very 
small. In particular, in Fig. 1, y = 14° and y = 8.5°. For linearly 
polarized excitation, the cross-polarized lobes have a peak value of 
—45 dB. It is anticipated that, in beams with small off-axis angles, as 
in Table I, the cross-coupling will be about the same. 


VI. MULTIPLE-BEAM ANTENNA PARAMETERS


The off-axis beam parameters and corresponding feed-horn dimensions
of an offset Cassegrainian antenna fed with corrugated horns can be
calculated once the main-aperture diameter and operating wavelengths
are specified. Typical results for an earth-station antenna at 4 and
6 GHz and a satellite antenna at 20 and 30 GHz are given in Table I.
Similar results for other diameters and wavelengths can be found by
following the text and performing the calculations in the order listed
in Table I.


Table I—Multiple-beam antenna parameters

                                Earth Station      Satellite at
                                at 4/6 GHz         20/30 GHz
Aperture diameter, D            30 meters          3 meters
Wavelength, λ                   7.5 cm/5 cm        1.5 cm/1 cm
Beamwidth, β                    0.165°/0.11°       0.33°/0.22°
Primary focal length, F′        30 meters          3 meters
Off-axis beam angle, θ          5°                 4°
No. of beamwidths, N = θ/β      30/45              12/18
Off-axis parameter, X′          30/45              12/18
Main-reflector offset, Y2       25 meters          2.5 meters
Subreflector offset, Y1         5 meters           0.5 meter
Equivalent focal length, F      100 meters         10 meters
F/D ratio                       3.33               3.33
Coma aberration, Δφ             18°/27°            14°/21°
Off-axis parameter, X           3.0/4.5            1.2/1.8
Feed-horn length, L             3.8 meters         76 cm
Feed-horn diameter, 2a          1.03 meters        20.5 cm
Beam spacing, θ₁                0.6°               1.2°
Isolation at θ₁ spacing         37 dB/43 dB        37 dB/43 dB
Isolation at 1° spacing         45 dB/49 dB        —
No. of available beams          16 (in a row)      18 (within U.S.)


VII. CONCLUSIONS


An offset Cassegrainian antenna fed with corrugated horns is ex- 
pected to have well-isolated multiple beams that are broadband and 
dual-polarized. The antenna has good area efficiency and is relatively 
compact. This combination of properties makes the antenna well-suited
for earth stations and satellites.


Limiting the Propagation of Errors in One-Bit
Differential CODECs 


By J. C. CANDY 
(Manuscript received March 27, 1974) 


An improved delta modulator is described that communicates to the 
receiver changes in the magnitude of the signal instead of changes in the 
amplitude. It is shown that propagation of errors in this system is limited, 
even when digital accumulators without leakage are used for integration. 


I. INTRODUCTION 


A major factor in the design of differential codecs is achieving rapid
recovery from transmission errors. Traditionally,¹ slow leakage in
analog integrators has allowed error signals to decay with a defined
time constant. We describe here another method for curtailing the
propagation of errors, one that is well suited for use with digital
integration. Digital integration² is attractive for differential codecs
constructed of integrated circuits. It allows the conversion to analog
format to be left to a late stage of signal processing, thereby avoiding
the need for high-grade amplifiers and carefully matched pulses, and
enables signal amplitudes to be companded by appropriate design of
the digital-to-analog (D/A) conversion network.

Introducing slow linear leakage into digital integrators is incon- 
venient. Indeed, leakage is often undesirable because perfect integra- 
tion has an advantage of its own, once the effect of circuit and trans- 
mission faults has been contained. 

Reference 2 demonstrates how a periodic clamp or an overload of
the integrator corrects errors, but neither of these methods is suitable
for use with speech signals. The proposed method is a simple one;
instead of signaling changes of amplitude, the coder merely signals
changes of magnitude. This small modification causes errors to fall
quickly to zero. It has application to a variety of codecs, being
especially well suited for delta modulators and related 1-bit codecs
used for transmitting speech. Application to multibit differential
codecs is somewhat restricted.


II. DIFFERENTIAL CODECS INCORPORATING DIGITAL INTEGRATION


There are several ways of providing digital integration for
differential codecs; Fig. 1 shows three methods. The first is a delta
modulator that uses an up-down counter.³,⁴ The second is a multilevel
differential codec that uses a conventional accumulator,⁵ comprising an
adder and a register. The third is an interpolative codec⁶ that uses a
bidirectional shift register. If this register is fed 1’s at the lower
input and 0’s at the upper, the entire contents shift up or down during
each cycle, in response to the output of the threshold decision circuit.
Details of this third circuit will be discussed in a later paper.

It is clear that a transmission error or inaccurate start-up procedure 
in any of these circuits can result in permanent mistracking of the trans- 
mitting and receiving integrators. Such mistracking may not be very 
serious for uniformly quantized signals, but when the D/A con- 
version levels are companded, mistracking would be catastrophic. 
Logarithmically companded magnitudes are useful for transmitting 
speech, and they are easily obtained by means of the circuit in Fig. 1c. 

Figure 2 shows a codec modified to communicate changes of mag-
nitude: Whenever the most significant bit in the counter is 0, the code
is inverted for transmission. The code is reinverted at the receiver 
under control of the most significant bit of the receiving counter. 
Notice that for symmetric signals, such as speech, the most significant 
bit indicates the polarity of the signal, and the remaining bits describe 
its magnitude, negative magnitudes being in 2’s complement format. 

The circuits in Fig. 1 do not explicitly show the means for protecting 
the digital integrators from overflowing, but such protection is needed 
to prevent serious distortion of very large signals. Figure 2 incorporates 
two gates, A and B, that detect when the counter is full or empty. 
Their outputs inhibit threshold decisions that would cause overflow or
underflow.

Figure 3 contrasts the responses of the circuits in Figs. 1a and 2
when transmission errors occur in the eighth and nineteenth cycles.
Permanent mistracking can occur when amplitude changes are sig- 
naled, but the errors quickly disappear when magnitude changes are 
signaled. The speed of recovery depends on the frequency of zero 
crossings of the input signal: A single positive error is wiped out when 
the signal would have crossed through zero going positively, or when 
the erroneous signal crosses zero going negatively. Zero crossings in 
the other direction correct negative errors. 
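
The contrast drawn above can be reproduced with a short simulation. The following Python sketch is illustrative only and is not taken from the circuits of Figs. 1a and 2: the 6-bit counter width, the sinusoidal test input, the placement of zero midway between the two middle counter states, and the choice to reverse (rather than suppress) a step that would overflow are all assumptions, and the function names are hypothetical.

```python
import math

N_BITS = 6                     # counter width (assumed for illustration)
FULL = (1 << N_BITS) - 1       # counter-full state, detected by gate A
HALF = 1 << (N_BITS - 1)       # weight of the most significant bit


def amplitude(count):
    """Signed output level; the zero reference lies between HALF-1 and HALF."""
    return count - HALF + 0.5


def step(count, up):
    """Move one level, reversing any decision that would over- or underflow
    (an assumed reading of how gates A and B act)."""
    if count == FULL:
        up = False
    elif count == 0:
        up = True
    return count + 1 if up else count - 1


def invert(bit, count, magnitude_code):
    """Invert the code whenever the counter's most significant bit is 0."""
    return bit ^ 1 if magnitude_code and count < HALF else bit


def encode(samples, magnitude_code):
    count, code = HALF, []
    for x in samples:
        nxt = step(count, x > amplitude(count))   # threshold decision
        code.append(invert(int(nxt > count), count, magnitude_code))
        count = nxt
    return code


def decode(code, magnitude_code):
    count, out = HALF, []
    for bit in code:
        count = step(count, bool(invert(bit, count, magnitude_code)))
        out.append(amplitude(count))
    return out


def with_errors(code, cycles):
    """Flip the bits sent in the given 1-based cycles."""
    return [b ^ 1 if i + 1 in cycles else b for i, b in enumerate(code)]


def residual_error(magnitude_code, n=400):
    """Decoder error in each cycle after bit errors in cycles 8 and 19."""
    samples = [10 * math.sin(2 * math.pi * k / 80) for k in range(n)]
    code = encode(samples, magnitude_code)
    good = decode(code, magnitude_code)
    bad = decode(with_errors(code, {8, 19}), magnitude_code)
    return [b - g for g, b in zip(good, bad)]


if __name__ == "__main__":
    print("amplitude code, final error:", residual_error(False)[-1])
    print("magnitude code, final error:", residual_error(True)[-1])
```

With `magnitude_code=False` the receiver keeps a constant offset from the transmitter; with `magnitude_code=True` the offset shrinks each time the smaller of the two magnitudes passes through zero, and it does not return once it has vanished.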

Fig. 1—Differential codecs employing digital integration. (a) Steps are counted
in a delta modulator. (b) Steps are accumulated in a differential codec. (c) A shift
register stores 1’s in an interpolative codec.

Fig. 3—Codec responses. (a) Representative input and output. (b) Delta modulator
code. (c) Delta modulator code with two errors. (d) Delta modulator responses. (e)
Code that signals magnitude change. (f) Code with two errors. (g) Output waveforms.


Fig. 4—Another circuit that signals magnitude.

An alternative circuit arrangement for signaling magnitude is shown
in Fig. 4. Here the exclusive NOR gate that inverts the code is placed
in the feedback loop of the coder, and the counter holds code that
describes only the magnitude of the signal. Polarity is defined by the
state of a toggle circuit, T. The AND gate A prevents overflow of the
counter. The OR gate B detects when the counter is empty, and when
a further decrease in magnitude is demanded, it diverts the clock from
the counter to changing the state of the toggle. When the toggle in-
dicates a negative polarity, its output inverts both the code and the
output of the D/A network, thereby preserving negative feedback.
This circuit arrangement is often easier to implement for companded
codes than is the arrangement in Fig. 2. The methods illustrated in
Figs. 2 and 4 can be used to modify any codec in Fig. 1. For the
application to multilevel coding, the exclusive NOR gate should be
used to invert only the polarity bit, the magnitude in the transmitted
bit remaining unchanged.
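
As a rough check that the two arrangements describe the same decoding rule, the Python sketch below models a receiver built around a single counter (in the manner of Fig. 2) and one built around a magnitude counter with a polarity toggle (in the manner of Fig. 4). It is a behavioral model under stated assumptions, not a transcription of either figure: the 5-bit width, the half-level offset of the outputs, and the reversal of any step that would overflow are assumed, and the function names are hypothetical.

```python
import random

N_BITS = 5                      # total width, polarity included (assumed)
HALF = 1 << (N_BITS - 1)
FULL = (1 << N_BITS) - 1
MAX_MAG = HALF - 1              # largest state of the magnitude counter


def decode_single_counter(code):
    """One up-down counter; its most significant bit controls code inversion."""
    count, out = HALF, []
    for bit in code:
        up = bool(bit) if count >= HALF else not bit
        if count == FULL:       # assumed action of the full detector
            up = False
        elif count == 0:        # assumed action of the empty detector
            up = True
        count += 1 if up else -1
        out.append(count - HALF + 0.5)
    return out


def decode_magnitude_toggle(code):
    """Magnitude counter plus a polarity toggle T."""
    magnitude, negative, out = 0, False, []
    for bit in code:
        if bit:                                   # increase of magnitude
            magnitude += 1 if magnitude < MAX_MAG else -1   # gate A
        elif magnitude == 0:                      # gate B: counter empty
            negative = not negative               # clock diverted to T
        else:
            magnitude -= 1
        level = magnitude + 0.5
        out.append(-level if negative else level)
    return out


if __name__ == "__main__":
    random.seed(0)
    code = [random.randint(0, 1) for _ in range(2000)]
    print(decode_single_counter(code) == decode_magnitude_toggle(code))
```

In this model a received 1 always means "step away from zero" and a received 0 "step toward zero", so the two receivers produce identical outputs for any code sequence.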


III. PRESERVING SIGNAL POLARITY 


A liability of signaling magnitudes is the inability to continuously 
inform the receiver of signal polarity. We now demonstrate that trans- 
mission errors cannot cause an inversion of the signal. 

The output signal at any time has one of a set of discrete values. In 
Fig. 5a, these values have been numbered in order, as have the cycle 
times. The graph starts at cycle 1 on level 5, an odd-numbered level; there-
after, the signal always has an odd-numbered value on an odd-num-
bered cycle, even after a transmission error has occurred. An inversion 
of polarity, illustrated in Fig. 5b, requires that the signal assume even- 
numbered values at odd-numbered cycles. This can occur only after 
incorrect start-up or loss of synchronization; once polarity is estab- 
lished, it will be preserved as long as the system remains synchronized, 
transmission errors having only a transitory effect. 
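
The parity argument can be checked numerically. The sketch below is a minimal illustration under assumptions that go beyond the text: the levels are numbered 1 to 10 (any even count would serve), the zero reference lies midway between the two middle levels, and a step that would leave the range is reversed. The helper names are hypothetical.

```python
import random

N_LEVELS = 10   # assumed even number of output levels, numbered 1..N_LEVELS


def track(start, code):
    """Level occupied in each cycle: `start` in cycle 1, then one step per bit."""
    level, out = start, [start]
    for bit in code:
        up = bool(bit)
        if level == N_LEVELS:      # assumed overflow protection: reverse the step
            up = False
        elif level == 1:
            up = True
        level += 1 if up else -1
        out.append(level)
    return out


def odd_on_odd(levels):
    """True if every odd-numbered cycle finds the signal on an odd-numbered level."""
    return all(level % 2 == 1 for level in levels[0::2])


def mirror(levels):
    """Polarity inversion about the midpoint between levels N/2 and N/2 + 1."""
    return [N_LEVELS + 1 - level for level in levels]


if __name__ == "__main__":
    random.seed(1)
    code = [random.randint(0, 1) for _ in range(200)]
    corrupted = [b ^ 1 if random.random() < 0.1 else b for b in code]
    print(odd_on_odd(track(5, code)))            # error-free reception
    print(odd_on_odd(track(5, corrupted)))       # reception with bit errors
    print(odd_on_odd(mirror(track(5, code))))    # a polarity-inverted waveform
```

Because every cycle moves the signal by exactly one level, bit errors change which level is reached but not its parity, whereas an inverted waveform would need the opposite parity in every cycle.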

Notice that the method used for avoiding overflow of the counters 
in Figs. 2 and 4 preserves the timing of the system in a way that pre-
vents an inversion of polarity when an error causes an overload. It also
helps to eliminate errors in a way that is analogous to the method 
described in Ref. 2. 

Fig. 5b—An inversion of polarity caused by incorrect start-up.

The above discussion applies whether or not the output levels are
companded, but it does assume that the signal steps up or down by
one level at a time. Multilevel differential codecs permit the signal to
step through several levels at a time; they can lose their hold on
polarity after an error unless step sizes are chosen with care. Spe-
cifically, the sum of any odd number of steps should never be equal to
the sum of an even number of steps. Regardless of the manner in which
the steps are chosen, multilevel steps tend to increase the time taken
to wipe out errors, as is illustrated in Fig. 6. These observations indi-
cate that signaling changes of magnitude is best suited for 1-bit cod-
ing, stepping one level at a time, but the levels themselves may be
companded.
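
The condition on step sizes lends itself to a brute-force check. The Python sketch below is one possible way to test a candidate set of step sizes; it assumes steps may be taken in either direction and searches only runs of up to `max_steps` steps, so a `True` result is evidence rather than proof. The function names are hypothetical.

```python
def reachable(steps, count):
    """All net displacements obtainable with exactly `count` steps."""
    sums = {0}
    for _ in range(count):
        sums = {s + d for s in sums for d in steps}
    return sums


def keeps_polarity(step_sizes, max_steps=6):
    """Check that no odd-length run of steps sums to the same value
    as an even-length run, for runs of up to `max_steps` steps."""
    steps = [s for size in step_sizes for s in (size, -size)]
    odd, even = set(), set()
    for count in range(1, max_steps + 1):
        (odd if count % 2 else even).update(reachable(steps, count))
    return odd.isdisjoint(even)


if __name__ == "__main__":
    print(keeps_polarity([1, 3, 5]))   # all step sizes odd
    print(keeps_polarity([1, 2, 4]))   # 1 + 1 equals a single step of 2
```

A set of odd integer step sizes always passes, since an odd number of odd steps has an odd sum and an even number has an even sum; `keeps_polarity([1, 3, 5])` returns `True` and `keeps_polarity([1, 2, 4])` returns `False`.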


IV. CONCLUSION 


Codes that signal changes in magnitude direct the output to step 
either toward or away from zero amplitude. We have seen that such 
coding wipes out the effect of a transmission error when next the signal 
passes zero in an appropriate direction. 

Fig. 6—Response of a multilevel differential codec having step sizes ±1, ±3, ±5.
(a) Ordinary differential code. (b) The code that signals changes of magnitude. (c)
The code with an error. (d) The correct and erroneous responses.

Use of zero as the reference is appropriate for coding speech signals,
because they frequently pass through zero amplitude. In general, the
reference can be any value that does not correspond exactly to a
possible output level. For coding video, the reference is best set at an
amplitude corresponding to medium brightness so that errors can be
wiped out quickly.

Conventional differential codecs signal changes in amplitude; they
may be regarded as having a reference set outside the signal range and
do not use the error-correcting properties associated with an internal
reference.